Chapter 1
Introduction 



Description Logics (DLs) are used in knowledge-based systems to represent and reason 
about terminological knowledge of the application domain in a semantically well-defined 
manner. They allow the definition of complex concepts (i.e., classes, unary predicates) 
and roles (relations, binary predicates) to be built from atomic ones by the application of 
a given set of constructors. A DL system allows concept descriptions to be interrelated
and implicit knowledge to be derived from the explicitly represented knowledge using
inference services.

This thesis is concerned with issues of reasoning with DLs and Guarded Logics, which 
generalise many of the good properties of DLs to a large fragment of first-order predicate 
logic. We study inference algorithms for these logics, both from the viewpoint of (worst- 
case) complexity of the algorithms and their usability in system implementations. This 
chapter gives a brief introduction to DL systems and reasoning in DLs. After that, we 
introduce the specific aspects of DLs we will be dealing with and motivate their use in 
knowledge representation. We also introduce Guarded Logics and describe why they are 
interesting from the viewpoint of DLs. We finish with an overview of the structure of this 
thesis and the results we establish. 

1.1 Description Logic Systems 

Description Logics (DLs) are logical formalisms for representing and reasoning about con- 
ceptual and terminological knowledge of a problem domain. They have evolved from 
the knowledge representation formalism of Quillian's semantic networks (1967) and Min- 
sky's frame systems (1981), as an answer to the poorly defined semantics of these for- 
malisms (Woods, 1975). Indeed, one of the distinguishing features of DLs is the well- 
defined — usually Tarski-style, extensional — semantics. DLs are based on the notions of 
concepts (classes, unary predicates) and roles (binary relations) and are mainly character- 
ized by a set of operators that allow complex concepts and roles to be built from atomic 
ones. As an example, consider the following concept that describes fathers having a daughter
whose children are all rich, using concept conjunction (⊓), and universal (∀) and existential
(∃) restriction over the role has_child:

Male ⊓ ∃has_child.(Female ⊓ ∀has_child.Rich)

DL systems (Woods & Schmolze, 1992) employ DLs to represent knowledge of an 
application domain and offer inference services based on the formal semantics of the DL 
to derive implicit knowledge from the explicitly represented facts. 

In many DL systems, one can find the following components: 

• a terminological component or TBox, which uses the DL to formalise the terminological
domain knowledge. Usually, such a TBox at least allows abbreviations for complex
concepts to be introduced, but more general statements are also available in
some systems. As an example, consider the following TBox that formalizes some
knowledge about relationships of people, where ⊥ denotes the concept with empty
extension (the empty class):

Parent = Human ⊓ ∃has_child.Human ⊓ ∀has_child.Human
Husband = Male ⊓ ∃married_to.Human
Human = Male ⊔ Female
Husband ⊑ ∀married_to.Female
Male ⊓ Female = ⊥

The first three statements introduce Parent, Husband, and Human as abbreviations of
more complex concepts. The fourth statement additionally requires that instances
of Husband must satisfy ∀married_to.Female, i.e., that a man, if married, must be
married to a woman. Finally, the last statement expresses that the concepts Male
and Female must be disjoint as their intersection is defined to be empty.

• an assertional component or ABox, which formalizes (parts of) a concrete situation 
involving certain individuals. A partial description of a concrete family, e.g., might
look like this:

MARY : Female n Parent 
PETER : Husband 
(MARY, PETER) : has_child 

Note that ABox assertions may refer to concepts mentioned in the TBox.

• an inference engine, which allows implicit knowledge to be derived from what has 
been explicitly stated. One typical inference service is the calculation of the sub- 
sumption hierarchy, i.e., the arrangement of the concepts that occur in the TBox 
into a quasi-order according to their specialisation/generalisation relationship. In 
our example, this service could deduce that both Male and Female are specialisations
of (are subsumed by) Human. Another example of an inference service is instance
checking, i.e., determining whether an individual of the ABox is an instance of a certain
concept. In our example, one can derive that MARY has a daughter-in-law (i.e.,
is an instance of ∃has_child.∃married_to.Female) and is an instance of ¬Husband
because the TBox axiom Male ⊓ Female = ⊥ requires Male and Female to be disjoint.
We do not make a closed world assumption, i.e., assertions not present in the ABox
are not assumed to be false by default. This makes it impossible to infer whether
PETER is an instance of Parent or not because the ABox does not contain information
that supports or contradicts this.

KL-ONE (Brachman & Schmolze, 1985) is usually considered to be the first DL system.
Its representation formalism possesses a formal semantics and the system allows for the
specification of both terminological and assertional knowledge. The inference services
provided by KL-ONE include calculation of the subsumption hierarchy and instance checking.
Subsequently, a number of systems have been developed that follow the general layout of
KL-ONE.

1.2 Reasoning in Description Logics 

To be useful for applications, a DL system must at least satisfy the following three crite- 
ria: the implemented DL must be capable of capturing an interesting proportion of the 
domain knowledge, the system must answer queries in a timely manner, and the inferences
computed by the system should be accurate. At least, the inferences should be sound, so
that every drawn conclusion is correct. It is also desirable to have complete inference, so 
that every correct conclusion can be drawn. Obviously, some of these requirements stand 
in conflict, as a greater expressivity of a DL makes sound and complete inference more 
difficult or even undecidable. Consequently, theoretical research in DL has mainly focused 
on the expressivity of DLs and decidability and complexity of their inference algorithms. 

When developing such inference algorithms, one is interested in their computational 
complexity, their behaviour for "real life" problems, and, in case of incomplete algorithms, 
their "degree" of completeness. From a theoretical point of view, it is desirable to have al- 
gorithms that match the known worst-case complexity of the problem. From the viewpoint 
of the application, it is more important to have an easily implementable procedure that is 
amenable to optimizations and hence has good run-time behaviour in realistic applications. 

1.3 Expressive Description Logics 

Much research in Description Logic has been concerned with the expressivity and computational
properties of various DLs (for an overview of current issues in DL research, see, e.g.,
Baader, McGuinness, Nardi, & Patel-Schneider, 2001). These investigations were often
triggered by the provision of certain constructors in implemented systems (Nebel, 1988; 
Borgida & Patel-Schneider, 1994), or by the need for these operators in specific knowledge 
representation tasks (Baader & Hanschke, 1993; Franconi, 1994; Sattler, 1998). 
In the following we introduce the specific features of the DLs that are considered in
this thesis. 

1.3.1 Counting 

Since people tend to describe objects by the number of other objects they are related to 
("Cars have four wheels, spiders have eight legs, humans have one head, etc.") it does not 
come as a surprise that most DL systems offer means to capture these aspects. Number
restrictions, which allow one to specify the number of objects related via certain roles, can
already be found in KL-ONE and have subsequently been present in nearly all DL systems.
More recent systems, like FaCT (Horrocks, 1998) or iFaCT (Horrocks, 1999), also allow
for qualifying number restrictions (Hollunder & Baader, 1991), which, additionally, state
requirements on the related objects. Using number restrictions, it is possible, e.g., to define
the concept of parents having at least two children (Human ⊓ (≥2 has_child)), or of people
having exactly two sisters (Human ⊓ (≤2 has_sibling Female) ⊓ (≥2 has_sibling Female)).

It is not hard to see that, at least for moderately expressive DLs, reasoning with number 
restrictions is more involved than reasoning with universal or existential restrictions only, 
as number restrictions enforce interactions between role successors. The following concept 
describes humans having at least two daughters and at least two rich children but at most three children:

Human ⊓ (≥2 has_child Female) ⊓ (≥2 has_child Rich) ⊓ (≤3 has_child),

which implies that at least one of the daughters must be rich. This form of interaction 
between role successors cannot be created without number restrictions and has to be dealt 
with by inference algorithms. 
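
The interaction in this example can be checked by brute force. The following Python sketch is only an illustration and not one of the inference algorithms discussed in this thesis. It assumes that the children of one fixed individual are pairwise distinct and are described by just two hypothetical Boolean flags (female, rich); the Human conjunct is ignored because it does not constrain the children. All names in the code are illustrative.

```python
from itertools import product

# Every possible (is_female, is_rich) description of a single child.
LABELS = list(product([False, True], repeat=2))


def satisfies_restrictions(children):
    """Check (>=2 has_child Female), (>=2 has_child Rich) and (<=3 has_child)."""
    daughters = sum(1 for female, _ in children if female)
    rich = sum(1 for _, is_rich in children if is_rich)
    return daughters >= 2 and rich >= 2 and len(children) <= 3


def has_rich_daughter(children):
    """True if at least one child is both female and rich."""
    return any(female and is_rich for female, is_rich in children)


# All labellings of 0 to 3 distinct children that satisfy the restrictions.
witnesses = [
    kids
    for n in range(4)
    for kids in product(LABELS, repeat=n)
    if satisfies_restrictions(kids)
]

# Labellings among them in which no daughter is rich.
counterexamples = [kids for kids in witnesses if not has_rich_daughter(kids)]

print(len(witnesses) > 0, counterexamples)
```

Under these assumptions the list of counterexamples is empty while the list of witnesses is not: with at least two daughters and at least two rich children among at most three children, the two groups must overlap in at least 2 + 2 - 3 = 1 child.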

Number restrictions introduce a form of local counting into DLs: for an object it is 
possible to specify the number of other objects it is related to via a given role. There are 
also approaches to augment DLs with a form of global counting. Baader, Buchheit, and 
Hollunder (1996) introduce cardinality restrictions on concepts as a terminological formalism
that allows one to express constraints on the number of instances that a specific concept
may have in a domain. To stay with our family examples, using cardinality restrictions it
is possible to express that there are at most two earliest ancestors:

(≤2 (Human ⊓ (≤0 has_parent))).

1.3.2 Transitive Roles, Role Hierarchies, and Inverse Roles 

In many applications of knowledge representation, like configuration (Wache & Kamp, 
1996; Sattler, 1996b; McGuinness & Wright, 1998), ontologies (Rector & Horrocks, 1997) or 
various applications in the database area (Calvanese, Lenzerini, & Nardi, 1998; Calvanese, 
De Giacomo, Lenzerini, Nardi, & Rosati, 1998; Calvanese, De Giacomo, & Rosati, 1999; 
Franconi, Baader, Sattler, & Vassiliadis, 2000), aggregated objects are of key importance. 
Sattler (2000) argues that transitive roles and role hierarchies provide elegant means to 
express various kinds of part-whole relations that can be used to model aggregated objects. 
Again, to stay with our family example, it would be natural to require the has_offspring
or has_ancestor roles to be transitive as this corresponds to the intuitive understanding
of these roles. Without transitivity of the role has_offspring, the concept

∀has_offspring.Rich ⊓ ∃has_offspring.∃has_offspring.¬Rich

that describes someone who has only rich offspring and who has an offspring that has a
poor offspring, would not be unsatisfiable, which is counter-intuitive.
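
The effect of transitivity on this concept can be illustrated on one hand-picked interpretation. The Python sketch below is an assumption-laden illustration, not a satisfiability test: it only evaluates the concept at one element of a three-element structure, first with has_offspring read as an arbitrary relation and then with its transitive closure. The element names a, b, c and all function names are hypothetical.

```python
def transitive_closure(relation):
    """Return the smallest transitive relation containing `relation`."""
    closure = set(relation)
    while True:
        extra = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if extra <= closure:
            return closure
        closure |= extra


def successors(relation, x):
    """All y with (x, y) in the relation."""
    return {b for (a, b) in relation if a == x}


def in_example_concept(x, has_offspring, rich):
    """Is x an instance of: all offspring rich, and some offspring of an offspring not rich?"""
    only_rich = all(y in rich for y in successors(has_offspring, x))
    poor_two_steps = any(
        z not in rich
        for y in successors(has_offspring, x)
        for z in successors(has_offspring, y)
    )
    return only_rich and poor_two_steps


# a has offspring b, b has offspring c; only b is rich.
has_offspring = {("a", "b"), ("b", "c")}
rich = {"b"}

without_transitivity = in_example_concept("a", has_offspring, rich)
with_transitivity = in_example_concept("a", transitive_closure(has_offspring), rich)
print(without_transitivity, with_transitivity)
```

Without transitivity, a is an instance of the concept. Once c is also counted as an offspring of a, the universal restriction fails for a. The general argument is the same: under a transitive role every offspring of an offspring is itself an offspring and therefore has to be rich.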

Role hierarchies (Horrocks & Gough, 1997) provide a means to state sub-role relationships
between roles, e.g., to state that has_child is a sub-role of has_offspring, which makes it
possible to infer that, e.g., a grandchild of someone with only rich offspring must be rich.
Role hierarchies also play an important role when modelling sub-relations of the general 
part-whole relation (Sattler, 1996a). 

Role hierarchies only allow one to express an approximation of the intuitive understanding
of the relationship between the roles has_child and has_offspring. Our intuitive understanding
is that has_offspring is the transitive closure of has_child, whereas role hierarchies
with transitive roles are limited to stating that has_offspring is an arbitrary transitive
super-role of has_child. Yet, this approximation is sufficient for many knowledge representation
tasks and there is empirical evidence that it allows for faster implementations
than inferences that support transitive closure (Horrocks, Sattler, & Tobies, 2000a). 

Above we have used the roles has_offspring and has_ancestor, and the intuitive
understanding of these roles requires them to be mutually inverse. Without the expressive
means of inverse roles, this cannot be captured by a DL, so that the concept

¬Rich ⊓ ∃has_offspring.⊤ ⊓ ∀has_offspring.∀has_ancestor.Rich

that describes somebody poor who has an offspring and whose offspring only have rich
ancestors would not be unsatisfiable. This shortcoming of expressive power is removed by
the introduction of inverse roles into a DL, which makes it possible to replace has_ancestor by
has_offspring⁻¹, which denotes the inverse of has_offspring.

1.3.3 Nominals 

Nominals, i.e., atomic concepts referring to single individuals of the domain, are studied 
both in the area of DLs (Schaerf, 1994; Borgida & Patel-Schneider, 1994; De Giacomo 
& Lenzerini, 1996) and modal logics (Gargov & Goranko, 1993; Blackburn & Seligman, 
1995; Areces, Blackburn, & Marx, 2000). Nominals play an important role in knowledge 
representation because they allow one to capture the notion of uniqueness and identity. Coming
back to the ABox example from above, for a DL with nominals, the names MARY or
PETER may not only be used in ABox assertions but can also be used in place of atomic
concepts, which, e.g., makes it possible to describe MARY's children by the concept ∃has_child⁻¹.MARY.
Modeling named individuals by pairwise disjoint atomic concepts, as is done in the DL
system CLASSIC (Borgida & Patel-Schneider, 1994), is not adequate and leads to incorrect
inferences. For example, if MARY does not name a single individual, it would be impossible
to infer that every child of MARY must be a sibling of PETER (or PETER himself), and so the
concept

∃has_child⁻¹.MARY ⊓ ∀has_child⁻¹.(∀has_child.¬PETER)

together with the example ABox would be incorrectly satisfiable. It is clear that cardinality
restrictions on concepts can be used to express nominals, and we will see in this thesis
that the converse also holds.

For decision procedures, nominals cause problems because they destroy the tree model 
property of a logic, which has been proposed as an explanation for the good algorithmic behaviour
of modal and description logics (Vardi, 1996; Grädel, 1999c) and is often exploited
by decision procedures. 

1.4 Guarded Logics 

The guarded fragment of first-order predicate logic, introduced by Andréka, van Benthem,
and Németi (1998), is a successful attempt to transfer many good properties of modal,
temporal, and Description Logics to a larger fragment of predicate logic. Among these
are decidability, the finite model property, invariance under an appropriate variant of
bisimulation, and other nice model theoretic properties (Andréka et al., 1998; Grädel,
1999b).

The Guarded Fragment (GF) is obtained from full first-order logic through relativiza- 
tion of quantifiers by so-called guard formulas. Every appearance of a quantifier in GF 
must be of the form 

∃y(α(x, y) ∧ φ(x, y)) or ∀y(α(x, y) → φ(x, y)),

where α is a positive atomic formula, the guard, that contains all free variables of φ.
This generalizes quantification in description, modal, and temporal logic, where quantification
is restricted to those elements reachable via some accessibility relation. For example,
in DLs, quantification occurs in the form of existential and universal restrictions like
∀has_child.Rich, which expresses that those individuals reachable via the role (guarded
by) has_child must be rich. 
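
To make the connection explicit, the two kinds of restriction can be written as first-order formulas with one free variable x, following the usual reading of concepts as unary and roles as binary predicates. The rendering below is a sketch in LaTeX notation (it assumes the amsmath package for the aligned environment); in both formulas the atom has_child(x, y) plays the part of the guard, and it contains both variables that occur free in the guarded subformula.

```latex
\[
\begin{aligned}
\forall \mathit{has\_child}.\mathit{Rich}
  &\;\mapsto\; \forall y\,\bigl(\mathit{has\_child}(x,y) \rightarrow \mathit{Rich}(y)\bigr),\\
\exists \mathit{has\_child}.\mathit{Rich}
  &\;\mapsto\; \exists y\,\bigl(\mathit{has\_child}(x,y) \wedge \mathit{Rich}(y)\bigr).
\end{aligned}
\]
```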

By allowing for more general formulas as guards while preserving the idea of quantifi- 
cation only over elements that are close together in the model, one obtains generalisations 
of GF which are still well-behaved in the sense of GF. Most importantly, one can obtain 
the loosely guarded fragment (LGF) (van Benthem, 1997) and the clique guarded fragment
(CGF) (Grädel, 1999a), for which decidability, invariance under clique guarded bisimulation,
and some other properties have been shown by Grädel (1999a). For other extensions
of GF the picture is irregular. While GF remains decidable under the extension with fixed
point operators (Grädel & Walukiewicz, 1999), adding counting constructs or transitivity
statements leads to undecidability (Grädel, 1999b; Ganzinger, Meyer, & Veanes, 1999).

Guarded fragments are of interest for the DL community because many DLs are readily 
embeddable into suitable guarded fragments. This allows the transfer of results for guarded 
fragments to DLs. For example, Gonçalves and Grädel (2000) show decidability of the
guarded fragment μAGF_CI, into which a number of expressive DLs are easily embeddable,
yielding decidability for these DLs.

1.5 Outline and Structure of this Thesis 

This thesis deals with reasoning in expressive DLs and Guarded Logics. We supply a num- 
ber of novel complexity results and practical algorithms for inference problems. Generally, 
we are more interested in the algorithmic properties of the logics we study than their ap- 
plication in concrete knowledge representation tasks. Consequently, the examples given 
in this thesis tend to be terse and abstract and are biased towards computational charac- 
teristics. For more information on how to use DLs for specific knowledge representation 
tasks, e.g., refer to (Brachman, McGuinness, Patel-Schneider, Resnick, & Borgida, 1991; 
Borgida, 1995; Calvanese et al., 1998; Sattler, 2000). 
This thesis is structured as follows: 

• We start with a more formal introduction to DLs in Chapter 2. We introduce the 
standard DL ALC and define its syntax and semantics. We specify the relevant
inference problems and show how they are interrelated. 

• Chapter 3 briefly surveys techniques employed for reasoning with DLs. We then 
describe a tableau algorithm that decides satisfiability of ALC-concepts with optimum
worst-case complexity (PSpace) to introduce important notions and methods for 
dealing with tableau algorithms. 

• In Chapter 4 we consider the complexity of a number of DLs that allow for qualifying
number restrictions. The DL ALCQ is obtained from ALC by, additionally, allowing
for qualifying number restrictions. We give a tableau algorithm that decides concept
satisfiability for ALCQ. We show how this algorithm can be modified to run in
PSpace, which fixes the complexity of the problem as PSpace-complete. Previously,
the exact complexity of the problem was only known for the (unnaturally) restricted
case of unary coding of numbers (Hollunder & Baader, 1991) and the problem was
conjectured to be ExpTime-hard for the unrestricted case (van der Hoek & de Rijke,
1995). We use the methods developed for ALCQ to obtain a tableau algorithm that
decides concept satisfiability for the DL ALCQIb, which adds expressive role expressions
to ALCQ, in PSpace, which solves an open problem from (Donini, Lenzerini,
Nardi, & Nutt, 1997).

We show that, for ALCQIb, reasoning w.r.t. general TBoxes and knowledge bases is
ExpTime-complete. This extends the known result for ALCQI (De Giacomo, 1995)
to a more expressive DL and, unlike the proof in (De Giacomo, 1995), our proof is
not restricted to the case of unary coding of numbers in the input.

• The next chapter deals with the complexity of reasoning with cardinality restrictions 
on concepts. We study the complexity of the combination of the DLs ALCQ and
ALCQI with cardinality restrictions. These combinations can naturally be embedded
into C², the two-variable fragment of predicate logic with counting quantifiers
(Grädel, Otto, & Rosen, 1997), which yields decidability in NExpTime (Pacholski,
Szwast, & Tendera, 1997) (in the case of unary coding of numbers). We show that this
is a (worst-case) optimal solution for ALCQI, as ALCQI with cardinality restrictions
is already NExpTime-hard. In contrast, we show that for ALCQ with cardinality
restrictions, all standard inferences can be solved in ExpTime. This result is obtained
by giving a mutual reduction between reasoning with cardinality restrictions and
reasoning with nominals. Based on the same reduction, we show that already concept
satisfiability for ALCQI extended with nominals is NExpTime-complete. The
results for ALCQI can easily be generalised to ALCQIb.

• In Chapter 6 we study DLs with transitive and inverse roles. For the DL SI (the
extension of ALC with inverse and transitive roles) we describe a tableau algorithm
that decides concept satisfiability in PSpace, which matches the known lower bound
for the worst-case complexity of the problem and extends Spaan's results for the
modal logic K4_t (1993b).

SI is then extended to SHIQ, a DL which, additionally, allows for role hierarchies
and qualifying number restrictions. We determine the worst-case complexity of
reasoning with SHIQ as ExpTime-complete. The ExpTime upper bound has been
an open problem so far. Moreover, we show that reasoning becomes NExpTime-complete
if nominals are added to SHIQ.

The algorithm used to establish the ExpTime bound for SHIQ employs a highly
inefficient automata construction and cannot be used for efficient implementations.
Instead, we describe a tableau algorithm for SHIQ that promises to be amenable to
optimizations and forms the basis of the highly optimized DL system iFaCT (Horrocks,
1999).

• In Chapter 7 we develop a tableau algorithm for the clique guarded fragment of 
FOL, based on the same ideas usually found in algorithms for modal logics or DLs. 
Since tableau algorithms form the basis of some of the fastest implementations of 
DL systems, we believe that this algorithm is a viable starting point for an efficient 
implementation of a decision procedure for CGF. Since many DLs are embeddable 
into CGF, such an implementation would be of high interest. 

• In a final chapter, we conclude. 

Some of the results in this thesis have previously been published. The PSpace-algorithm
for ALCQ has been reported in (Tobies, 1999b) and is extended to deal with
inverse roles and conjunction of roles in (Tobies, 2001). NExpTime-completeness of ALCQI
with cardinality restrictions is presented in (Tobies, 1999a, 2000), where the latter publication
establishes the connection of reasoning with nominals and with cardinality restrictions.
The SI-algorithm is presented in (Horrocks, Sattler, & Tobies, 2000a); a description of the
tableau algorithm for SHIQ can be found in (Horrocks, Sattler, & Tobies, 1999). Finally,
the tableau algorithm for CGF has previously been published in (Hirsch & Tobies, 2000).






Chapter 2 
Preliminaries 



In this chapter we give a more formal introduction to Description Logics and their inference 
problems. We define syntax and semantics of the "basic" DL ALC and of the terminological
and assertional formalism used in this thesis. Based on these definitions, we introduce a 
number of inference problems and show how they are interrelated. 

2.1 The Basic DL ALC

Schmidt-Schauß and Smolka (1991) introduce the DL ALC, which is distinguished in that
it is the "smallest" DL that is closed under all Boolean connectives, and give a sound and
complete subsumption algorithm. Unlike the other DL inference algorithms developed at
that time, their algorithm deviated from the structural paradigm and used a new approach, which,
due to its close resemblance to first-order logic tableau algorithms, was later also called
tableau algorithm. Later, Schild's (1991) discovery that ALC is a syntactic variant of the
basic modal logic K made it apparent that Schmidt-Schauß and Smolka had re-invented in
DL notation the tableau approach that had been successfully applied to modal inference
problems (see, e.g., Ladner, 1977; Halpern & Moses, 1992; Goré, 1998).

The DL ALC allows complex concepts to be built from concept and relation names using
the propositional constructors ⊓ (and, class intersection), ⊔ (or, class union), and ¬ (not,
class complementation). Moreover, concepts can be related using universal and existential
quantification along role names. 

Definition 2.1 (Syntax of ALC)

Let NC be a set of concept names and NR be a set of role names. The set of ALC-concepts
is built inductively from these using the following grammar, where A ∈ NC and R ∈ NR:

C ::= A | ¬C | C₁ ⊓ C₂ | C₁ ⊔ C₂ | ∀R.C | ∃R.C.

o 
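
The grammar of Definition 2.1 can be mirrored by a small recursive check. The following Python sketch is only one possible encoding and is not taken from any DL system: concepts are assumed to be nested tuples such as ("and", C, D) or ("all", R, C), and the tags "atom", "not", "and", "or", "all", "some" as well as the function name are hypothetical choices made for this illustration.

```python
def is_alc_concept(c, concept_names, role_names):
    """Check c against the grammar C ::= A | not C | C1 and C2 | C1 or C2 | all R.C | some R.C."""
    if not isinstance(c, tuple) or not c:
        return False
    op, args = c[0], c[1:]

    def rec(d):
        return is_alc_concept(d, concept_names, role_names)

    if op == "atom":                      # concept name A
        return len(args) == 1 and args[0] in concept_names
    if op == "not":                       # negation
        return len(args) == 1 and rec(args[0])
    if op in ("and", "or"):               # conjunction, disjunction
        return len(args) == 2 and rec(args[0]) and rec(args[1])
    if op in ("all", "some"):             # universal, existential restriction
        return len(args) == 2 and args[0] in role_names and rec(args[1])
    return False


NC = {"Male", "Female", "Rich"}
NR = {"has_child"}

# Male and some has_child.(Female and all has_child.Rich), the example from Section 1.1.
father = ("and", ("atom", "Male"),
          ("some", "has_child",
           ("and", ("atom", "Female"),
            ("all", "has_child", ("atom", "Rich")))))

# Not a concept: "Rich" is used where a role name is required.
malformed = ("all", "Rich", ("atom", "Male"))

print(is_alc_concept(father, NC, NR), is_alc_concept(malformed, NC, NR))
```

Each branch of the function corresponds to one production of the grammar, so the recursion follows the inductive structure of the definition.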
For now, we will use an informal definition of the size |C| of a concept C: we define
|C| to be the number of symbols necessary to write down C over the alphabet NC ∪ NR ∪
{⊓, ⊔, ¬, ∀, ∃, (, )}. This will not be the definitive definition of the size of the concept
because it relies on an unbounded alphabet (NC and NR are arbitrary sets), which makes
it unsuitable for complexity considerations. We will clarify this issue in Definition 3.11.

Starting with (Brachman & Levesque, 1984), semantics of DLs model concepts as sets 
and roles as binary relations. Starting from an interpretation of the concept and role 
names, the semantics of arbitrary concepts are defined by induction over their syntactic 
structure. For ACC, this is done as follows. 

Definition 2.2 (Semantics of ALC)

The semantics of ALC-concepts is defined relative to an interpretation I = (Δ^I, ·^I), where
Δ^I is a non-empty set, called the domain of I, and ·^I is a valuation that defines the
interpretation of concept and relation names by mapping every concept name to a subset
of Δ^I and every role name to a subset of Δ^I × Δ^I. To obtain the semantics of a complex
concept this valuation is inductively extended by setting:

(¬C)^I = Δ^I \ C^I
(C₁ ⊓ C₂)^I = C₁^I ∩ C₂^I
(C₁ ⊔ C₂)^I = C₁^I ∪ C₂^I
(∀R.C)^I = {x ∈ Δ^I | for all y ∈ Δ^I, (x, y) ∈ R^I implies y ∈ C^I}
(∃R.C)^I = {x ∈ Δ^I | there is a y ∈ Δ^I with (x, y) ∈ R^I and y ∈ C^I}.

A concept C is satisfiable iff there is an interpretation I such that C^I ≠ ∅. A concept
C is subsumed by a concept D (written C ⊑ D) iff, for every interpretation I, C^I ⊆ D^I.
Two concepts C, D are equivalent (written C = D) iff C ⊑ D and D ⊑ C. o
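
For a finite interpretation, the inductive definition above can be evaluated directly. The Python sketch below is an illustration under stated assumptions: concepts are encoded as nested tuples with the hypothetical tags "atom", "not", "and", "or", "all", "some"; the domain is a finite set; and the valuation is given by two dictionaries mapping concept names to sets of elements and role names to sets of pairs. The individuals john, ann and bob are made up for the example.

```python
def extension(concept, domain, concepts, roles):
    """Compute the extension of a concept in a finite interpretation."""
    op = concept[0]
    if op == "atom":
        return set(concepts.get(concept[1], ()))
    if op == "not":
        return set(domain) - extension(concept[1], domain, concepts, roles)
    if op in ("and", "or"):
        left = extension(concept[1], domain, concepts, roles)
        right = extension(concept[2], domain, concepts, roles)
        return left & right if op == "and" else left | right
    if op in ("all", "some"):
        edges = roles.get(concept[1], ())
        filler = extension(concept[2], domain, concepts, roles)
        # "all" quantifies over every role successor, "some" needs one witness.
        quantifier = all if op == "all" else any
        return {x for x in domain
                if quantifier(y in filler for (a, y) in edges if a == x)}
    raise ValueError(f"unknown constructor: {op!r}")


domain = {"john", "ann", "bob"}
concepts = {"Male": {"john", "bob"}, "Female": {"ann"}, "Rich": {"bob"}}
roles = {"has_child": {("john", "ann"), ("ann", "bob")}}

# Male and some has_child.(Female and all has_child.Rich)
father = ("and", ("atom", "Male"),
          ("some", "has_child",
           ("and", ("atom", "Female"),
            ("all", "has_child", ("atom", "Rich")))))

result = extension(father, domain, concepts, roles)
print(result)
```

In this interpretation the extension is {john}, which is non-empty, so the interpretation witnesses that the concept is satisfiable. Subsumption and equivalence cannot be settled this way, because they quantify over all interpretations.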



From this definition it is apparent, as noticed by Schild (1991), that ALC is a syntactic
variant of the propositional (multi-)modal logic K_m. More precisely, for a set of concept
names NC and role names NR, the logic ALC corresponds to the modal logic K_m with
propositional atoms NC and modal operators {⟨R⟩, [R] | R ∈ NR}, where the Boolean
operators of ALC (⊓, ⊔, ¬) correspond to the Boolean operators of K_m (∧, ∨, ¬), existential
restrictions over a role R to the diamond modality ⟨R⟩, and universal restrictions over a
role R to the box modality [R]. Applying this syntactic transformation in either direction
yields, for every ALC-concept C, an equivalent K_m-formula φ_C and, for every K_m-formula
φ, an equivalent ALC-concept C_φ. A similar correspondence exists also for more expressive
DLs.
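
One direction of this syntactic transformation can be sketched as a recursive function that rewrites a concept into a modal formula. The encoding of concepts as nested tuples with the tags "atom", "not", "and", "or", "all", "some" and the function name to_modal are assumptions made for this illustration; the output is simply a string in modal notation.

```python
def to_modal(concept):
    """Rewrite a nested-tuple concept as a multi-modal formula string."""
    op = concept[0]
    if op == "atom":                       # concept name becomes a propositional atom
        return concept[1]
    if op == "not":
        return "¬" + to_modal(concept[1])
    if op == "and":
        return f"({to_modal(concept[1])} ∧ {to_modal(concept[2])})"
    if op == "or":
        return f"({to_modal(concept[1])} ∨ {to_modal(concept[2])})"
    if op == "some":                       # existential restriction becomes a diamond
        return f"⟨{concept[1]}⟩{to_modal(concept[2])}"
    if op == "all":                        # universal restriction becomes a box
        return f"[{concept[1]}]{to_modal(concept[2])}"
    raise ValueError(f"unknown constructor: {op!r}")


father = ("and", ("atom", "Male"),
          ("some", "has_child",
           ("and", ("atom", "Female"),
            ("all", "has_child", ("atom", "Rich")))))

formula = to_modal(father)
print(formula)
```

The translation touches every constructor exactly once and leaves the structure of the expression unchanged, which is why the correspondence is called a syntactic variant.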

We will often use ⊤ as an abbreviation for an arbitrary tautological concept, i.e., a
concept with ⊤^I = Δ^I for every interpretation I. E.g., ⊤ = A ⊔ ¬A for an arbitrary
concept name A ∈ NC. Similarly, we use ⊥ as an abbreviation for an unsatisfiable concept
(⊥^I = ∅ for every interpretation I). E.g., ⊥ = A ⊓ ¬A for an arbitrary A ∈ NC. Also,
we will use the standard logical abbreviations C → D for ¬C ⊔ D and C ↔ D for
(C → D) ⊓ (D → C).
2.2 Terminological and Assertional Formalism

Starting from the initial KL-ONE system (Brachman & Schmolze, 1985), DL systems allow
two categories of knowledge about the application domain to be expressed:

• terminological knowledge, which is stored in a so-called TBox and consists of general
definitions of concepts and knowledge about their interrelation, and

• assertional knowledge, which is stored in a so-called ABox and consists of a (partial)
description of a specific situation consisting of elements of the application domain.

It should be noted that there are DL systems, e.g., FaCT (Horrocks, 1998), that do 
not support ABoxes but are limited to TBoxes only. In contrast to this, all systems that 
support ABoxes also have some kind of support for TBoxes. 

Different DL systems allow for different kinds of TBox formalism, which has an impact 
on the difficulty of the various inference problems. Here, we define the most general form 
of TBox formalism usually studied — general axioms — and describe other possibilities as a 
restriction of this formalism. 

Definition 2.3 (General Axioms, TBox)

A general axiom is an expression of the form C ⊑ D or C = D where C and D are concepts.
A TBox T is a finite set of general axioms.

An interpretation I satisfies a general axiom C ⊑ D (C = D) iff C^I ⊆ D^I (C^I = D^I).
It satisfies T iff it satisfies every axiom in T. In this case, T is called satisfiable, I is called
a model of T and we write I ⊨ T.

Satisfiability, subsumption and equivalence of concepts can also be defined w.r.t. TBoxes:
a concept C is satisfiable w.r.t. T iff there is a model I of T with C^I ≠ ∅. C is subsumed
by D w.r.t. T (written C ⊑_T D) iff C^I ⊆ D^I for every model I of T. Equivalence w.r.t. T
is defined analogously. o

Most DL systems, e.g., KRIS (Baader & Hollunder, 1991), allow only for a limited form 
of TBox that essentially contains only macro definitions. This is captured by the following 
definition. 

Definition 2.4 (Simple TBox) 

A TBox T is called simple iff

• the left-hand sides of axioms consist only of concept names, that is, T consists only of
axioms of the form A ⊑ D and A = D for A ∈ NC,

• a concept name occurs at most once as the left-hand side of an axiom in T, and

• T is acyclic. Acyclicity is defined as follows: A ∈ NC is said to "directly use" B ∈ NC
if A = D ∈ T or A ⊑ D ∈ T and B occurs in D; "uses" is the transitive closure of
"directly uses". We say that T is acyclic if there is no A ∈ NC that uses itself.

o 
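
The three conditions of Definition 2.4 translate into a short check. The Python sketch below is an illustration only and assumes a hypothetical encoding: concepts are nested tuples with the tags "atom", "not", "and", "or", "all", "some", "top", "bot", and a TBox is a list of triples (kind, left-hand side, right-hand side) with kind "sub" or "eq". The "uses" relation is explored by a simple graph search over "directly uses".

```python
def names_in(concept):
    """Concept names occurring in a nested-tuple concept."""
    op = concept[0]
    if op in ("top", "bot"):
        return set()
    if op == "atom":
        return {concept[1]}
    if op == "not":
        return names_in(concept[1])
    if op in ("and", "or"):
        return names_in(concept[1]) | names_in(concept[2])
    if op in ("all", "some"):
        return names_in(concept[2])
    raise ValueError(f"unknown constructor: {op!r}")


def is_simple(tbox):
    """Check the three conditions for a simple TBox."""
    directly_uses = {}
    for _kind, lhs, rhs in tbox:
        if lhs[0] != "atom":              # left-hand side must be a concept name
            return False
        if lhs[1] in directly_uses:       # at most one axiom per concept name
            return False
        directly_uses[lhs[1]] = names_in(rhs)

    def uses_itself(start):
        """Search along 'directly uses' for a path from start back to start."""
        seen, stack = set(), list(directly_uses.get(start, ()))
        while stack:
            name = stack.pop()
            if name == start:
                return True
            if name not in seen:
                seen.add(name)
                stack.extend(directly_uses.get(name, ()))
        return False

    return not any(uses_itself(name) for name in directly_uses)


def atom(name):
    return ("atom", name)


simple_tbox = [
    ("eq", atom("Parent"), ("and", atom("Human"), ("some", "has_child", atom("Human")))),
    ("eq", atom("Human"), ("or", atom("Male"), atom("Female"))),
]
cyclic_tbox = [("eq", atom("Human"), ("some", "has_parent", atom("Human")))]
general_tbox = [("eq", ("and", atom("Male"), atom("Female")), ("bot",))]

print(is_simple(simple_tbox), is_simple(cyclic_tbox), is_simple(general_tbox))
```

The first TBox passes all three conditions. The second fails acyclicity because Human directly uses itself, and the third fails the first condition because its left-hand side is a complex concept.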
Partial descriptions of the application domain can be given as an ABox.

Definition 2.5 (ABox)

Let NI be a set of individual names. For individual names x, y ∈ NI, a concept C, and a
role name R, the expressions x : C, (x, y) : R and x ≠ y are assertional axioms. An ABox
A is a finite set of assertional axioms.

To define the semantics of ABoxes we require interpretations, additionally, to map every
individual name x ∈ NI to an element x^I of the domain Δ^I.

An interpretation I satisfies an assertional axiom x : C iff x^I ∈ C^I, it satisfies (x, y) : R
iff (x^I, y^I) ∈ R^I, and it satisfies x ≠ y iff x^I ≠ y^I. I satisfies A iff it satisfies every assertional
axiom in A. If such an interpretation I exists, then A is called satisfiable, I is called a
model of A, and we write I ⊨ A. o

To decide whether I ⊨ A for an interpretation I and an ABox A, the interpretation
of those individuals that do not occur in A is irrelevant (Nebel, 1990a; Buchheit, Donini, 
& Schaerf, 1993). Thus, to define a model of an ABox A it is sufficient to specify the 
interpretation of those individuals occurring in A. Our definition of ABoxes is slightly 
different from what can usually be found in the literature, in that we do not impose the 
unique name assumption. The unique name assumption requires that every two distinct 
individuals must be mapped to distinct elements of the domain. We do not have this 
requirement but include explicit inequality assertions between two individuals as assertional 
axioms. It is clear that our approach is more general than the unique name assumption 
because inequality can be asserted selectively only for some individual names. We use this 
approach due to its greater flexibility and since it allows for a more uniform treatment of 
ABoxes in the context of tableau algorithms, which we will encounter in Chapter 3. 

Definition 2.6 (Knowledge Base) 

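A knowledge base (KB) $\mathcal{K} = (\mathcal{T}, \mathcal{A})$ consists of a TBox $\mathcal{T}$ and an ABox $\mathcal{A}$. An interpretation
$\mathcal{I}$ satisfies $\mathcal{K}$ iff $\mathcal{I} \models \mathcal{T}$ and $\mathcal{I} \models \mathcal{A}$. In this case, $\mathcal{K}$ is called satisfiable, $\mathcal{I}$ is called a
model of $\mathcal{K}$ and we write $\mathcal{I} \models \mathcal{K}$. ◇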

2.3 Inference Problems 

From the previous definitions, one can immediately derive a number of (so-called standard)
inference problems for DL systems that are commonly studied. Here, we quickly summarize
the most important of them and show how they are interrelated.

• Concept satisfiability, i.e., given a concept $C$, is $C$ satisfiable (maybe w.r.t. a
TBox $\mathcal{T}$)? This inference makes it possible to determine whether concepts in the KB are
contradictory (describe the empty class).

• Concept subsumption, i.e., given two concepts $C$, $D$, is $C$ subsumed by $D$ (maybe
w.r.t. a TBox $\mathcal{T}$)? Using this inference, concepts defined in a TBox can be arranged in
a subsumption quasi-order that reflects the specialisation/generalisation hierarchy of
the concepts. Calculation of the subsumption hierarchy is one of the main inferences
used by many applications of DL systems (e.g., Rector & Horrocks, 1997; Schulz &
Hahn, 2000; Bechhofer & Horrocks, 2000; Franconi & Ng, 2000).

For any DL that is closed under Boolean operations, subsumption and (un-)satisfiability
are mutually reducible: a concept $C$ is unsatisfiable w.r.t. a TBox $\mathcal{T}$ iff
$C \sqsubseteq_{\mathcal{T}} \bot$. Conversely, $C \sqsubseteq_{\mathcal{T}} D$ iff $C \sqcap \neg D$ is unsatisfiable w.r.t. $\mathcal{T}$.

Concept satisfiability and subsumption are problems that are usually considered only 
w.r.t. TBoxes rather than KBs. The reason for this is the fact (Nebel, 1990a; Buchheit 
et al., 1993) that the ABox does not interfere with these problems as long as the KB 
is satisfiable. W.r.t. unsatisfiable KBs, obviously every concept is unsatisfiable and 
every two concepts mutually subsume each other. 

• Knowledge Base Satisfiability, i.e., given a KB $\mathcal{K}$, is $\mathcal{K}$ satisfiable? This inference
makes it possible to check whether the knowledge stored in the KB is free of contradictions,
which is maybe the most fundamental requirement for knowledge in DL systems.
From a KB that contains a contradiction, i.e., is not satisfiable, arbitrary conclusions
can be drawn.

Concept satisfiability (and hence concept subsumption) can be reduced to KB satisfiability:
a concept $C$ is satisfiable w.r.t. a (possibly empty) TBox $\mathcal{T}$ iff the KB
$(\mathcal{T}, \{x : C\})$ is satisfiable.

• Instance Checking, i.e., given a KB $\mathcal{K}$, an individual name $x$, and a concept $C$,
is $x^{\mathcal{I}} \in C^{\mathcal{I}}$ for every model $\mathcal{I}$ of $\mathcal{K}$? In this case, $x$ is called an instance of $C$ w.r.t.
$\mathcal{K}$. Using this inference it is possible to deduce knowledge from a KB that is only
implicitly present, e.g., it can be deduced that an individual $x$ is an instance of a
concept $C$ in every model of the knowledge base even though $x : C$ is not asserted
explicitly in the ABox; it follows from the other assertions in the KB.

Instance checking can be reduced to KB (un-)satisfiability. For a KB $\mathcal{K} = (\mathcal{T}, \mathcal{A})$, $x$
is an instance of $C$ w.r.t. $\mathcal{K}$ iff the KB $(\mathcal{T}, \mathcal{A} \cup \{x : \neg C\})$ is unsatisfiable.

All the mentioned reductions are obviously computable in linear time. Hence, KB 
satisfiability can be regarded as the most general of the mentioned inference problems. As 
we will see in a later chapter, for some DLs, it is also possible to polynomially reduce KB 
satisfiability to concept satisfiability. 
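The reductions between the standard inference problems only construct a new input, which is why they are computable in linear time. The following Python sketch shows the constructions themselves. The tuple encoding of concepts and assertions and the function names are assumptions made for this illustration, and no reasoner is invoked.

```python
BOTTOM = "BOTTOM"  # stands for the empty concept; the name is illustrative


def unsatisfiability_as_subsumption(c):
    """C is unsatisfiable w.r.t. T iff C is subsumed by BOTTOM w.r.t. T."""
    return (c, BOTTOM)


def subsumption_as_unsatisfiability(c, d):
    """C is subsumed by D w.r.t. T iff the returned concept is unsatisfiable."""
    return ("and", c, ("not", d))


def concept_satisfiability_as_kb(tbox, c, individual="x"):
    """C is satisfiable w.r.t. T iff the returned KB (T, {x : C}) is satisfiable."""
    return (tbox, frozenset({("concept", individual, c)}))


def instance_check_as_kb(tbox, abox, x, c):
    """x is an instance of C w.r.t. (T, A) iff the returned KB is unsatisfiable."""
    return (tbox, frozenset(abox) | {("concept", x, ("not", c))})
```

Each function touches its arguments only once, so the size of the constructed instance exceeds the size of the input by a constant.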

Chapter 3

Reasoning in Description Logics 



This chapter starts with an overview of methods that have been developed to solve DL
inference problems. We then describe a tableau algorithm that decides concept satisfiability
and subsumption for $\mathcal{ALC}$ and which can be implemented to run in PSpace. Although this is
a well-known result (Schmidt-Schauß & Smolka, 1991), it is repeated here because it allows
us to introduce important notions and methods for dealing with tableau algorithms before
these are applied to obtain results for more expressive logics in the subsequent chapters.

3.1 Reasoning Paradigms 

Generally speaking, there are four major and some minor approaches to reasoning with 
DLs that will be briefly described here. Refer to (Baader & Sattler, 2000) for a more 
history-oriented introduction to reasoning with DLs. 

Structural algorithms. The early DL systems like KL-ONE (Brachman & Schmolze,
1985) and its successor systems BACK (Quantz & Kindermann, 1990), K-REP (Mays,
Dionne, & Weida, 1991), or LOOM (MacGregor, 1991) used structural algorithms that
rely on syntactic comparison of concepts in a suitable normal form to decide subsumption.
Nebel (1990a) gives a formal description of an algorithm based on this approach. Usually,
these algorithms had very good (polynomial) run-time behaviour. Tractability was a major
concern in the development of DL systems and algorithms with super-polynomial runtime
were considered unusable in practical applications (Levesque & Brachman, 1987). Yet, as
it turned out, even DLs with very limited expressive power prohibit tractable inference
algorithms (Brachman & Levesque, 1984; Nebel, 1990b) and for some, like KL-ONE,
subsumption is even undecidable (Schmidt-Schauß, 1989). Consequently, complete structural
algorithms are known only for DLs of very limited expressivity.

These limitations were addressed by DL researchers in three general ways: some system
developers deliberately committed to incomplete algorithms to preserve the good run-time
behaviour of their systems. Others proceeded by carefully tailoring the DL to maximise
its expressivity while maintaining sound and complete structural algorithms. Representatives
of the former approach are the BACK and the LOOM system while CLASSIC (Patel-
Schneider, McGuiness, Brachman, Resnick, & Borgida, 1991) follows the latter approach
with a "nearly" complete structural subsumption algorithm (Borgida & Patel-Schneider,
1994). The third approach was to develop algorithms that are capable of dealing with more
expressive DLs despite the higher complexity. This required a departure from the methods
employed so far.

Tableau algorithms. The first such algorithm was developed by Schmidt-Schauß and
Smolka (1991) for the DL $\mathcal{ALC}$, and the employed methodology proved to be useful to
decide subsumption and other inference problems like concept satisfiability also for other
DLs (Hollunder, Nutt, & Schmidt-Schauß, 1990; Hollunder & Baader, 1991; Baader, 1991;
Hanschke, 1992). Due to their close resemblance to tableau algorithms for first-order
predicate logic (FOL) they were also called tableau algorithms. For many DLs, it was
possible to obtain algorithms based on the tableau approach that match the known worst-
case complexity of the problem (see, e.g., Donini, Lenzerini, Nardi, & Nutt, 1991a, 1991b;
Donini, Hollunder, Lenzerini, Spaccamela, Nardi, & Nutt, 1992, for a systematic study).
Although the inference problems for these DLs are usually at least NP- or even PSpace-hard,
systems implementing the tableau approach, like KRIS (Baader & Hollunder, 1991)
or CRACK (Bresciani, Franconi, & Tessaris, 1995), show reasonable runtime performance
on application problems and more recent systems that employ sophisticated optimization
techniques, like Horrocks's FaCT system (1998) or Patel-Schneider's DLP (2000), can deal
with problems of considerable size, even for ExpTime-hard DLs. For these logics, the
employed tableau algorithms exceed the known worst-case complexity of the problems,
but are rather biased towards optimizability for "practical" cases. Indeed, for ExpTime-
complete DLs, it turns out to be very involved to obtain tableau algorithms with optimum
worst-case complexity (Donini & Massacci, 2000).

Translational approaches. Schild's discovery (1991) that DLs are syntactic variants of
modal logics made it possible to obtain inference procedures for DLs by simply borrowing
the methods from the corresponding modal logic. This approach has been refined for
more expressive DLs and a number of (worst-case) optimal decision procedures for very
expressive (usually ExpTime-complete) DLs were obtained by sophisticated translation
into PDL (De Giacomo & Lenzerini, 1994a; De Giacomo & Lenzerini, 1994c, 1996;
De Giacomo, 1995) or the modal $\mu$-calculus (Schild, 1994; De Giacomo & Lenzerini, 1994b).
While many interesting complexity results could be obtained in this manner, there exists
no implementation of a DL system that utilizes this approach. Experiments indicate
(Horrocks et al., 2000a) that it will be very hard to obtain efficient implementations based on
this kind of translation. More recently, modal logicians like Areces and de Rijke (2000)
have advocated hybrid modal logics (Areces et al., 2000; Areces, 2000) as a suitable target
for the translation of DLs and obtain novel theoretical results and decision procedures. It
is unclear if these decision procedures can be implemented efficiently.

A different approach utilizes translation into FOL. Already Brachman and Levesque
(1984) use FOL to specify the semantics of their DL and inference problems for nearly all
DLs (and their corresponding modal logics) are easily expressible in terms of satisfiability
problems for (extensions of) FOL. Since FOL is undecidable, one does not immediately
obtain a decision procedure in this manner. So, these approaches use more restricted and
hence decidable fragments of FOL as target of their translation. Borgida (1996) uses the
two-variable fragment of FOL to prove decidability of an expressive DL in NExpTime
while De Nivelle (2000) gives a translation of a number of modal logics into the guarded
fragment to facilitate the application of FOL theorem proving methods to these logics.
Schmidt (1998) uses a non-standard translation into a fragment of FOL for which decision
procedures based on a FOL theorem prover exist. Areces et al. (2000) show that careful
tuning of standard FOL theorem proving methods also yields a decision procedure for the
standard translation. The latter approaches are specifically biased towards FOL theorem
provers and make it possible to utilize the massive effort spent on the implementation
and optimization of FOL theorem provers to reason with DLs. It seems, though, that the
translation approach leads to acceptable but inferior runtime when compared with tableau
systems (Massacci & Donini, 2000; Horrocks, Patel-Schneider, & Sebastiani, 2000).

Automata-based methods. Many DLs and modal logics possess the so-called tree model
property, i.e., every satisfiable concept has, under a suitable abstraction, a tree-shaped
model. This makes it possible to reduce the satisfiability of a concept to the existence
of a tree with certain properties dependent on the formula. If it is possible to capture
these properties using a tree automaton (Gecseg & Steinby, 1984; Thomas, 1992),
satisfiability and hence subsumption of the logic can be reduced to the emptiness problem of
the corresponding class of tree automata (Vardi & Wolper, 1986). Especially for DLs with
ExpTime-complete inference problems, where it is difficult to obtain tableau algorithms
with optimum complexity, exact complexity results can be obtained elegantly using the
automaton approach (Calvanese, De Giacomo, & Lenzerini, 1999; Lutz & Sattler, 2000).
Yet, so far it seems impossible to obtain efficient implementations from automata-based
algorithms. The approach usually involves an exponential step that occurs in every case
independent of the "difficulty" of the input concept and cannot be avoided by existing
methods. This implies that such an algorithm will exhibit exponential behaviour even for
"easy" instances, which so far prohibits the use of the approach in practice.

Other approaches. In addition to these approaches, there also exist further, albeit less
influential, approaches. Instead of dealing with DLs as fragments of a more expressive
formalism, the SAT-based method developed by Giunchiglia and Sebastiani (1996) uses
the opposite approach and extends reasoning procedures for the less expressive formalism
of propositional logic to DLs. Since highly sophisticated SAT-solvers are available, this
approach has proven to be rather successful. Yet, it cannot compete with tableau-based
algorithms (Massacci, 1999) and so far is not applicable to DLs more expressive than $\mathcal{ALC}$.

The inverse method (Voronkov, 2000) takes a radically different approach to satisfiability
testing. It tries to prove unsatisfiability of a formula in a bottom-up manner, by trying
to derive the input formula starting from "atomic" contradictions. There exists only a
very early implementation of the inverse method (Voronkov, 1999), which shows promising
run-time performance but at the moment it cannot compete with tableau-based systems
(Massacci, 1999).

Both the SAT-based approach and the inverse method have so far not been studied 
with respect to their worst-case complexity. 

3.2 Tableau Reasoning for $\mathcal{ALC}$-satisfiability

Even though KB satisfiability is the most general standard inference problem, it is also 
worthwhile to consider solutions for the less general problems. For many applications 
of DL systems, the ABox does not play a role and reasoning is performed solely on the 
terminological level (Rector & Horrocks, 1997; Schulz & Hahn, 2000; Bechhofer & Horrocks, 
2000; Franconi & Ng, 2000). For these applications, the additional overhead of dealing 
with ABoxes is unnecessary. Additionally, ABoxes have no counterpart in the modal
world (with the exception of hybrid modal logics, see (Areces & de Rijke, 2000)) and
hence theoretical results obtained for KB inferences do not transfer as easily as results 
for reasoning with TBoxes, which often directly apply to modal logics. From a pragmatic 
point of view, since full KB reasoning is at least as hard as reasoning w.r.t. TBoxes, it is 
good to know how to deal (efficiently) with the latter problem before trying to solve the 
former. Finally, as we will see in Section 3.2.3, sometimes concept satisfiability suffices to 
solve the more complicated inference problems. 

Schmidt-Schauß and Smolka (1991) were the first to give a complete subsumption algorithm
for $\mathcal{ALC}$. The algorithm they used followed a new paradigm for the development of
inference algorithms for DLs that proved to be applicable to a vast range of DL inference
problems and, due to its resemblance to tableau algorithms for FOL, was later called the
tableau approach (see Baader & Sattler, 2000, for an overview of tableau algorithms for
DLs). After the correspondence of DLs and modal logics had been pointed out by Schild
(1991), it became apparent that the tableau algorithms developed for DLs also closely
resembled those used by modal logicians. The tableau approach has turned out to be
particularly amenable to optimizations and some of the most efficient DL and modal reasoners
currently available are based on tableau algorithms (see Massacci & Donini, 2000, for a
system comparison).

Generally speaking, a tableau algorithm for a DL tries to prove satisfiability of a concept 
(or a knowledge base) by trying to explicitly construct a model or some kind of structure 
that induces the existence of a model (a pre-model). This is done by manipulating a 
constraint system — some kind of data structure that contains a partial description of a 
model or pre-model — using a set of completion rules. Such constraint systems usually 
consist of a number of individuals for which role relationships and membership in the 
extension of concepts are asserted, much like this is done in an ABox. Indeed, for $\mathcal{ALC}$
and the DLs considered in the next chapter, it is convenient to use the ABox formalism to
capture the constraints. For more expressive DLs, it will be more viable to use a different
data structure, e.g., to emphasise the graph structure underlying the ABox.

Independent of the formalism used to express the constraints, completion of such a 
constraint system is performed starting from an initial constraint system, which depends 
on the input concept (or knowledge base), until either an obvious contradiction (a clash) 
has been generated or no more rules can be applied. In the latter case, the rules have 
been chosen in a way that a model of the concept (or knowledge base) can be immediately 
derived from the constraint system. 

Definition 3.1 (Negation Normal Form)

In the following, we will consider concepts in negation normal form (NNF), a form in which
negations ($\neg$) appear only in front of concept names. Every $\mathcal{ALC}$-concept can be equivalently
transformed into NNF by pushing negation inwards using the following equivalences:

$$
\begin{aligned}
\neg(C_1 \sqcap C_2) &\equiv \neg C_1 \sqcup \neg C_2 & \qquad \neg\forall R.C &\equiv \exists R.\neg C \\
\neg(C_1 \sqcup C_2) &\equiv \neg C_1 \sqcap \neg C_2 & \qquad \neg\exists R.C &\equiv \forall R.\neg C \\
\neg\neg C &\equiv C
\end{aligned}
$$

Note that every $\mathcal{ALC}$-concept can be transformed into NNF in linear time.

For a concept $C$ in NNF, we denote the set of sub-concepts of $C$ by $sub(C)$. Obviously,
the size of $sub(C)$ is bounded by $|C|$. ◇

3.2.1 Deciding Concept Satisfiability for $\mathcal{ALC}$

We will now describe the $\mathcal{ALC}$-algorithm that decides concept satisfiability (and hence
concept subsumption) for $\mathcal{ALC}$. As mentioned before, we use ABoxes to capture the constraint
systems generated by the $\mathcal{ALC}$-algorithm.

Algorithm 3.2 ($\mathcal{ALC}$-algorithm)

An ABox $\mathcal{A}$ contains a clash iff, for an individual name $x \in \mathsf{NI}$ and a concept name $A \in \mathsf{NC}$,
$\{x : A, x : \neg A\} \subseteq \mathcal{A}$. Otherwise, $\mathcal{A}$ is called clash-free.

To test the satisfiability of an $\mathcal{ALC}$-concept $C$ in NNF, the $\mathcal{ALC}$-algorithm works as
follows. Starting from the initial ABox $\mathcal{A}_0 = \{x_0 : C\}$ it applies the completion rules from
Figure 3.1, which modify the ABox. It stops when a clash has been generated or when no
rule is applicable. In the latter case, the ABox is complete. The algorithm answers "$C$ is
satisfiable" iff a complete and clash-free ABox has been generated.

From the rules in Figure 3.1, the $\rightarrow_\sqcup$-rule is called non-deterministic while the other
rules are called deterministic. The $\rightarrow_\exists$-rule is called generating, while the other rules are
called non-generating. ◇

Figure 3.1: The completion rules for $\mathcal{ALC}$

$\rightarrow_\sqcap$:  if    $x : C_1 \sqcap C_2 \in \mathcal{A}$ and
             $\{x : C_1, x : C_2\} \not\subseteq \mathcal{A}$
       then  $\mathcal{A} \rightarrow_\sqcap \mathcal{A} \cup \{x : C_1, x : C_2\}$

$\rightarrow_\sqcup$:  if    $x : C_1 \sqcup C_2 \in \mathcal{A}$ and
             $\{x : C_1, x : C_2\} \cap \mathcal{A} = \emptyset$
       then  $\mathcal{A} \rightarrow_\sqcup \mathcal{A} \cup \{x : D\}$ for some $D \in \{C_1, C_2\}$

$\rightarrow_\exists$:  if    $x : \exists R.D \in \mathcal{A}$ and
             there is no $y$ with $\{(x, y) : R, y : D\} \subseteq \mathcal{A}$
       then  $\mathcal{A} \rightarrow_\exists \mathcal{A} \cup \{(x, y) : R, y : D\}$ for a fresh individual $y$

$\rightarrow_\forall$:  if    $x : \forall R.D \in \mathcal{A}$ and
             there is a $y$ with $(x, y) : R \in \mathcal{A}$ and $y : D \notin \mathcal{A}$
       then  $\mathcal{A} \rightarrow_\forall \mathcal{A} \cup \{y : D\}$

The $\mathcal{ALC}$-algorithm is a non-deterministic algorithm due to the $\rightarrow_\sqcup$-rule, which
non-deterministically chooses which disjunct to add for a disjunctive concept. Also, we have
not specified a precedence that determines which rule to apply if there is more than one
possibility. To prove that such a non-deterministic algorithm is indeed a decision procedure
for satisfiability of $\mathcal{ALC}$-concepts, we have to establish three things:

1. Termination, i.e., every sequence of rule-applications terminates after a finite num- 
ber of steps. 

2. Soundness, i.e., if the algorithm has generated a complete and clash-free ABox for 
C, then C is satisfiable. 

3. Completeness, i.e., for a satisfiable concept C there is a sequence of rule applications 
that leads to a complete and clash-free ABox for C. 
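Since the algorithm expects its input in negation normal form, the transformation of Definition 3.1 can be sketched in Python as follows. The encoding is an assumption made for this illustration only: concept names are strings and complex concepts are tuples `("not", C)`, `("and", C, D)`, `("or", C, D)`, `("some", R, C)` and `("all", R, C)`. Every sub-concept is visited a bounded number of times, in line with the linear-time remark.

```python
def nnf(concept):
    """Push negation inwards until it occurs only in front of concept names."""
    if isinstance(concept, str):  # concept name
        return concept
    op = concept[0]
    if op in ("and", "or"):
        return (op, nnf(concept[1]), nnf(concept[2]))
    if op in ("some", "all"):
        return (op, concept[1], nnf(concept[2]))
    if op != "not":
        raise ValueError(f"unknown constructor: {op!r}")
    inner = concept[1]
    if isinstance(inner, str):  # negated concept name: already in NNF
        return concept
    inner_op = inner[0]
    if inner_op == "not":  # double negation
        return nnf(inner[1])
    if inner_op == "and":  # De Morgan
        return ("or", nnf(("not", inner[1])), nnf(("not", inner[2])))
    if inner_op == "or":  # De Morgan
        return ("and", nnf(("not", inner[1])), nnf(("not", inner[2])))
    if inner_op == "some":  # negated existential becomes universal
        return ("all", inner[1], nnf(("not", inner[2])))
    if inner_op == "all":  # negated universal becomes existential
        return ("some", inner[1], nnf(("not", inner[2])))
    raise ValueError(f"unknown constructor: {inner_op!r}")
```

For example, the negation of the conjunction of `A` and `("some", "R", ("not", "B"))` becomes the disjunction of `("not", "A")` and `("all", "R", "B")`.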

When dealing with non-deterministic algorithms, one can distinguish two different kinds
of non-determinism, namely don't-know and don't-care non-determinism. Choices of an
algorithm that may affect the result are called don't-know non-deterministic. For the
$\mathcal{ALC}$-algorithm, the choice of which disjunct to add by the $\rightarrow_\sqcup$-rule is don't-know
non-deterministic. When dealing with the initial ABox

$$\mathcal{A} = \{x_0 : A \sqcup (B \sqcap \neg B)\},$$

the algorithm will only find a clash-free completion of $\mathcal{A}$ if the $\rightarrow_\sqcup$-rule chooses to add the
assertion $x_0 : A$. In this sense, adding $x_0 : A$ is a "good" choice while adding $x_0 : B \sqcap \neg B$
would be a "bad" choice because it prevents the discovery of a clash-free completion of
$\mathcal{A}$ even though there is one. For a (necessarily deterministic) implementation of the
$\mathcal{ALC}$-algorithm, this implies that exhaustive search over all possibilities of don't-know
non-deterministic choices is required to obtain a complete algorithm.

Non-deterministic choices that do not affect the outcome of the algorithm in the sense
that any choice is a "good" choice are called don't-care non-deterministic. Don't-care
non-determinism is also (implicitly) present in the $\mathcal{ALC}$-algorithm. Even though in an ABox
several rules might be applicable at the same time, the algorithm does not specify which
rule to apply to which constraint in which order. On the contrary, it will turn out that,
whenever a rule is applicable, it can be applied in a way that leads to the discovery of
a complete and clash-free ABox for a satisfiable concept (Lemma 3.9). This implies that
in case of a (deterministic) implementation of the $\mathcal{ALC}$-algorithm one is free to choose an
arbitrary strategy for deciding which rule to apply where and when without sacrificing the
completeness of the algorithm, although the efficiency of the implementation might depend
on the choice of the employed strategy.
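A deterministic implementation along these lines can be sketched in Python. This is a naive illustration under stated assumptions, not the implementation of any existing system: the input is assumed to be in NNF already, concepts are encoded as strings and tuples such as `("or", C, D)` or `("some", R, C)`, assertions are the tuples `("c", x, C)` and `("r", x, y, R)`, and individuals are integers with `0` playing the role of the root. Don't-know non-determinism is resolved by trying both disjuncts, while the order of rule applications is left to set iteration order, which is harmless because it is a don't-care choice.

```python
from itertools import count


def satisfiable(concept):
    """Decide satisfiability of a concept that is already in NNF."""
    return _expand(frozenset({("c", 0, concept)}), count(1))


def _expand(abox, fresh):
    concepts = {(a[1], a[2]) for a in abox if a[0] == "c"}
    roles = {(a[1], a[2], a[3]) for a in abox if a[0] == "r"}
    # Clash: some individual is asserted to be in A and in its negation.
    for x, c in concepts:
        if isinstance(c, str) and (x, ("not", c)) in concepts:
            return False
    for x, c in concepts:
        if isinstance(c, str) or c[0] == "not":
            continue
        op = c[0]
        if op == "and":
            new = {("c", x, c[1]), ("c", x, c[2])}
            if not new <= abox:
                return _expand(abox | new, fresh)
        elif op == "or":
            if ("c", x, c[1]) not in abox and ("c", x, c[2]) not in abox:
                # Don't-know choice: search both alternatives.
                return any(_expand(abox | {("c", x, d)}, fresh) for d in c[1:])
        elif op == "some":
            r, d = c[1], c[2]
            witnessed = any(
                x2 == x and r2 == r and (y, d) in concepts for (x2, y, r2) in roles
            )
            if not witnessed:
                y = next(fresh)  # fresh individual
                return _expand(abox | {("r", x, y, r), ("c", y, d)}, fresh)
        elif op == "all":
            r, d = c[1], c[2]
            # Several applications of the universal rule are batched here.
            missing = {
                ("c", y, d)
                for (x2, y, r2) in roles
                if x2 == x and r2 == r and (y, d) not in concepts
            }
            if missing:
                return _expand(abox | missing, fresh)
    return True  # no rule applicable: complete and clash-free
```

On the example above, `satisfiable(("or", "A", ("and", "B", ("not", "B"))))` succeeds regardless of which disjunct is tried first, because the failing branch is abandoned and the other one is explored. The recursion depth grows with the number of rule applications, so the sketch is only suitable for small inputs.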

Termination 

The general idea behind the termination proofs of the tableau algorithms we will encounter 
in this thesis is the following: 

• The concepts and roles appearing in a constraint are taken from a finite set. 

• Paths in the constraint system are of bounded length and every individual has a 
bounded number of successors. 

• The application of a rule either adds a constraint for an individual already present 
in the constraint system, or it adds new individuals. No constraints or individuals 
are ever deleted, or, if deletion takes place, the number of deletions is bounded. 

Together, this implies termination of the tableau algorithm since an infinite sequence 
of rule applications would either lead to an unbounded number of constraints for a sin- 
gle individual or to infinitely many individuals in the constraint system. Both stand in 
contradiction to the mentioned properties. 

To prove the termination of the $\mathcal{ALC}$-algorithm, it is convenient to "extract" the underlying
graph-structure from an ABox and to view it as an edge and node labelled graph.

Definition 3.3 

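Let $\mathcal{A}$ be an ABox. The graph $G_{\mathcal{A}}$ induced by $\mathcal{A}$ is an edge and node labelled graph
$G_{\mathcal{A}} = (V, E, \mathbf{L})$ defined by

$$
\begin{aligned}
V &= \{x \in \mathsf{NI} \mid x \text{ occurs in } \mathcal{A}\}, \\
E &= \{(x, y) \mid (x, y) : R \in \mathcal{A}\}, \\
\mathbf{L}(x) &= \{D \mid x : D \in \mathcal{A}\}, \\
\mathbf{L}(x, y) &= \{R \mid (x, y) : R \in \mathcal{A}\}.
\end{aligned}
$$

◇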
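The induced graph can be computed directly from the set of assertions. The following Python sketch assumes, purely for illustration, that assertions are encoded as tuples `("c", x, D)` for $x : D$ and `("r", x, y, R)` for $(x, y) : R$. It returns the node set, the edge set, and the two labelling functions as dictionaries.

```python
def induced_graph(abox):
    """Return (V, E, node_label, edge_label) for the graph induced by an ABox."""
    nodes, edges = set(), set()
    node_label, edge_label = {}, {}
    for assertion in abox:
        if assertion[0] == "c":  # x : D
            _, x, d = assertion
            nodes.add(x)
            node_label.setdefault(x, set()).add(d)
        elif assertion[0] == "r":  # (x, y) : R
            _, x, y, r = assertion
            nodes.update((x, y))
            edges.add((x, y))
            edge_label.setdefault((x, y), set()).add(r)
    for x in nodes:  # individuals that occur only in role assertions
        node_label.setdefault(x, set())
    return nodes, edges, node_label, edge_label
```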

It is easy to see that, for any ABox $\mathcal{A}$ generated by a sequence of applications of the
completion rules for $\mathcal{ALC}$ from an initial ABox $\{x_0 : C\}$, the induced graph $G_{\mathcal{A}}$ satisfies
the following properties:

• $G_{\mathcal{A}}$ is a tree rooted at $x_0$.

• For any node $x \in V$, $\mathbf{L}(x) \subseteq sub(C)$.

• For any edge $(x, y) \in E$, $\mathbf{L}(x, y)$ is a singleton $\{R\}$ for a role $R$ that occurs in $C$.

A proof of these properties can easily be given by induction on the number of rule
applications and is left to the reader. Moreover, it is easy to show the following lemma that
states that the graph generated by the $\mathcal{ALC}$-algorithm is bounded in the size of the input
concept.

Lemma 3.4 

Let $C$ be an $\mathcal{ALC}$-concept in NNF and $\mathcal{A}$ an ABox generated by the $\mathcal{ALC}$-algorithm by a
sequence of rule applications from the initial ABox $\{x_0 : C\}$. Then the following holds:

1. For every node $x$, the size of $\mathbf{L}(x)$ is bounded by $|C|$.

2. The length of a directed path in $G_{\mathcal{A}}$ is bounded by $|C|$.

3. The out-degree of $G_{\mathcal{A}}$ is bounded by $|C|$.

Proof.

1. For every node $x$, $\mathbf{L}(x) \subseteq sub(C)$. Hence, $|\mathbf{L}(x)| \leq |sub(C)| \leq |C|$.

2. For every node $x$ we define $\ell(x)$ as the maximum nesting of existential or universal
restrictions in a concept in $\mathbf{L}(x)$. Obviously, $\ell(x_0) \leq |C|$. Also, if $(x, y) \in E$, then
$\ell(x) > \ell(y)$. Hence, any path $x_1, \dots, x_k$ in $G_{\mathcal{A}}$ induces a sequence $\ell(x_1) > \dots > \ell(x_k)$
of non-negative integers. Since $G_{\mathcal{A}}$ is a tree rooted at $x_0$, the longest path starts with
$x_0$ and is bounded by $|C|$.

3. Successors of a node $x$ are only generated by an application of the $\rightarrow_\exists$-rule, which
generates at most one successor for each concept of the form $\exists R.D$ in $\mathbf{L}(x)$. Together
with (1), this implies that the out-degree is bounded by $|C|$. ■

From this lemma, termination of the $\mathcal{ALC}$-algorithm is a simple corollary:

Corollary 3.5 (Termination)

Any sequence of rule-applications of the $\mathcal{ALC}$-algorithm terminates after a finite number of
steps.

Proof. A sequence of rule-applications induces a sequence of trees whose depth and
out-degree are bounded by the size of the input concept by Lemma 3.4. Moreover, every rule
application adds a concept to the label of a node or adds a node to the tree. No nodes are
ever deleted from the tree and no concepts are ever deleted from the label of a node.

Hence, an unbounded sequence of rule-applications would either lead to an unbounded 
number of nodes or to an unbounded label of one of the nodes. Both cases contradict 
Lemma 3.4. ■ 
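As a rough illustration of how Lemma 3.4 limits the length of a run (a back-of-the-envelope estimate derived here from the lemma, not a bound stated in the text), write $n = |C|$ and assume $n \geq 2$. A tree whose paths have length at most $n$ and whose out-degree is at most $n$ has at most $\sum_{i=0}^{n} n^i$ nodes, and every rule application adds at least one new concept assertion to a node label that can hold at most $n$ concepts:

$$
\#\text{nodes} \;\leq\; \sum_{i=0}^{n} n^i \;=\; \frac{n^{n+1} - 1}{n - 1} \;\leq\; n^{n+1},
\qquad
\#\text{rule applications} \;\leq\; n \cdot \#\text{nodes} \;\leq\; n^{n+2}.
$$

The estimate is exponential in $|C|$. It only confirms finiteness and says nothing about the space needed by a careful implementation.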
Soundness and Completeness

Soundness and completeness of a tableau algorithm are usually proved by establishing the
following properties of the algorithm based on an appropriate notion of satisfiability of
constraint systems, which is tailored for the needs of every specific DL and tableau
algorithm.

1. A constraint system that contains a clash is necessarily unsatisfiable. 

2. The initial constraint system is satisfiable iff the input concept (or knowledge base) 
is satisfiable. 

3. A complete and clash-free constraint system is satisfiable.

4. For every applicable deterministic rule, its application preserves satisfiability of the
constraint system. For every applicable non-deterministic rule, there is a way of
applying the rule that preserves satisfiability.

5. For every rule, no satisfiable constraint system can be generated from an unsatisfiable 
one, or, alternatively, 

5'. a complete and clash-free constraint system implies satisfiability of the initial con- 
straint system. 

Properties 4 and 5 together are often referred to as local correctness of the rules.

Theorem 3.6 (Generic Correctness of Tableau Algorithms)

A terminating tableau algorithm that satisfies the properties mentioned above is correct. 

Proof. Termination is required as a precondition of the theorem. The tableau algorithm is
sound because a complete and clash-free constraint system is satisfiable (Property 3), which
implies satisfiability of the initial constraint system (either by Property 5 and induction
over the number of rule applications or directly by Property 5') and hence (by Property 2)
the satisfiability of the input concept (or knowledge base).

It is complete because, given a satisfiable input concept (or knowledge base), the ini- 
tial constraint system is satisfiable (Property 2). Each rule can be applied in a way that 
maintains the satisfiability of the constraint system (Property 4) and, since the algorithm 
terminates, any sequence of rule-applications is finite. Hence, after finitely many steps 
a satisfiable and complete constraint system can be derived from the initial one. This 
constraint system must be clash-free because (by Property 1) a clash would imply
unsatisfiability. ■

Specifically, for $\mathcal{ALC}$ we use the usual notion of satisfiability of ABoxes. Clearly, for
a satisfiable concept $C$, the initial ABox $\{x_0 : C\}$ is satisfiable and a clash in an ABox
implies unsatisfiability.

It remains to prove that a complete and clash-free ABox is satisfiable and that the rules 
preserve satisfiability in the required manner. The following definition extracts a model 
from a complete and clash-free ABox. 

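Definition 3.7 (Canonical Interpretation)

For an ABox $\mathcal{A}$, the canonical interpretation $\mathcal{I}_{\mathcal{A}} = (\Delta^{\mathcal{I}_{\mathcal{A}}}, \cdot^{\mathcal{I}_{\mathcal{A}}})$ is defined by

$$
\begin{aligned}
\Delta^{\mathcal{I}_{\mathcal{A}}} &= \{x \in \mathsf{NI} \mid x \text{ occurs in } \mathcal{A}\}, \\
A^{\mathcal{I}_{\mathcal{A}}} &= \{x \mid x : A \in \mathcal{A}\} \text{ for every } A \in \mathsf{NC}, \\
R^{\mathcal{I}_{\mathcal{A}}} &= \{(x, y) \mid (x, y) : R \in \mathcal{A}\} \text{ for every } R \in \mathsf{NR}, \\
x^{\mathcal{I}_{\mathcal{A}}} &= x \text{ for every individual } x \text{ that occurs in } \mathcal{A}.
\end{aligned}
$$

◇

Lemma 3.8

Let $\mathcal{A}$ be a complete and clash-free ABox. Then $\mathcal{A}$ has a model.

Proof. It is obvious that, for an arbitrary ABox $\mathcal{A}$, the canonical interpretation satisfies
all assertions of the form $(x, y) : R \in \mathcal{A}$. $\mathcal{A}$ does not contain any assertions of the form
$x \neq y$.

By induction on the structure of concepts occurring in $\mathcal{A}$, we show that the canonical
interpretation $\mathcal{I}_{\mathcal{A}}$ satisfies any assertion of the form $x : D \in \mathcal{A}$ and hence is a model of $\mathcal{A}$.

• For the base case $x : A$ with $A \in \mathsf{NC}$, this holds by definition of $\mathcal{I}_{\mathcal{A}}$.

• For the case $x : \neg A$, since $\mathcal{A}$ is clash-free, $x : A \notin \mathcal{A}$ and hence $x \notin A^{\mathcal{I}_{\mathcal{A}}}$.

• If $x : C_1 \sqcap C_2 \in \mathcal{A}$, then, since $\mathcal{A}$ is complete, also $\{x : C_1, x : C_2\} \subseteq \mathcal{A}$. By induction
this implies $x \in C_1^{\mathcal{I}_{\mathcal{A}}}$ and $x \in C_2^{\mathcal{I}_{\mathcal{A}}}$ and hence $x \in (C_1 \sqcap C_2)^{\mathcal{I}_{\mathcal{A}}}$.

• If $x : C_1 \sqcup C_2 \in \mathcal{A}$, then, again due to the completeness of $\mathcal{A}$, either $x : C_1 \in \mathcal{A}$ or
$x : C_2 \in \mathcal{A}$. By induction this yields $x \in C_1^{\mathcal{I}_{\mathcal{A}}}$ or $x \in C_2^{\mathcal{I}_{\mathcal{A}}}$ and hence $x \in (C_1 \sqcup C_2)^{\mathcal{I}_{\mathcal{A}}}$.

• If $x : \exists R.D \in \mathcal{A}$, then completeness yields $\{(x, y) : R, y : D\} \subseteq \mathcal{A}$ for some $y$. By
construction of $\mathcal{I}_{\mathcal{A}}$, $(x, y) \in R^{\mathcal{I}_{\mathcal{A}}}$ holds and by induction we have $y \in D^{\mathcal{I}_{\mathcal{A}}}$. Together
this implies $x \in (\exists R.D)^{\mathcal{I}_{\mathcal{A}}}$.

• If $x : \forall R.D \in \mathcal{A}$, then, for any $y$ with $(x, y) \in R^{\mathcal{I}_{\mathcal{A}}}$, $(x, y) : R \in \mathcal{A}$ must hold due
to the construction of $\mathcal{I}_{\mathcal{A}}$. Then, due to completeness, $y : D \in \mathcal{A}$ must hold and
induction yields $y \in D^{\mathcal{I}_{\mathcal{A}}}$. Since this holds for any such $y$, $x \in (\forall R.D)^{\mathcal{I}_{\mathcal{A}}}$. ■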
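The construction of the canonical interpretation and the claim of Lemma 3.8 can be tried out on small examples. The Python sketch below is an illustration under assumed encodings (assertions as tuples `("c", x, D)` and `("r", x, y, R)`, concepts in NNF as strings and tuples). The function names are hypothetical.

```python
def canonical_interpretation(abox):
    """Build domain, concept extensions and role extensions from an ABox."""
    domain, concept_ext, role_ext = set(), {}, {}
    for assertion in abox:
        if assertion[0] == "c":
            _, x, d = assertion
            domain.add(x)
            if isinstance(d, str):  # only concept names define extensions
                concept_ext.setdefault(d, set()).add(x)
        else:
            _, x, y, r = assertion
            domain.update((x, y))
            role_ext.setdefault(r, set()).add((x, y))
    return domain, concept_ext, role_ext


def holds(x, concept, concept_ext, role_ext):
    """Check whether individual x (interpreted as itself) belongs to concept."""
    if isinstance(concept, str):
        return x in concept_ext.get(concept, set())
    op = concept[0]
    if op == "not":
        return not holds(x, concept[1], concept_ext, role_ext)
    if op == "and":
        return all(holds(x, c, concept_ext, role_ext) for c in concept[1:])
    if op == "or":
        return any(holds(x, c, concept_ext, role_ext) for c in concept[1:])
    successors = [y for (x2, y) in role_ext.get(concept[1], set()) if x2 == x]
    if op == "some":
        return any(holds(y, concept[2], concept_ext, role_ext) for y in successors)
    if op == "all":
        return all(holds(y, concept[2], concept_ext, role_ext) for y in successors)
    raise ValueError(f"unknown constructor: {op!r}")


def is_canonical_model(abox):
    """Check that the canonical interpretation satisfies all concept assertions."""
    _, concept_ext, role_ext = canonical_interpretation(abox)
    return all(
        holds(a[1], a[2], concept_ext, role_ext) for a in abox if a[0] == "c"
    )
```

For a complete and clash-free ABox the check succeeds, as the lemma states. For an ABox that is not complete, such as one containing only an existential assertion without a witness, the check fails, which shows why completeness is needed in the proof.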
Lemma 3.9 (Local Correctness)

1. If $\mathcal{A}$ is an ABox and $\mathcal{A}'$ is obtained from $\mathcal{A}$ by an application of a completion rule,
then satisfiability of $\mathcal{A}'$ implies satisfiability of $\mathcal{A}$.

2. If $\mathcal{A}$ is satisfiable and $\mathcal{A}'$ is obtained from $\mathcal{A}$ by an application of a deterministic rule,
then $\mathcal{A}'$ is satisfiable.

3. If $\mathcal{A}$ is satisfiable and the $\rightarrow_\sqcup$-rule is applicable, then there is a way of applying the
$\rightarrow_\sqcup$-rule such that the obtained ABox $\mathcal{A}'$ is satisfiable.

Proof.

1. Since $\mathcal{A}$ is a subset of $\mathcal{A}'$, satisfiability of $\mathcal{A}'$ immediately implies satisfiability of $\mathcal{A}$.

2. Let $\mathcal{I}$ be a model of $\mathcal{A}$. We distinguish the different rules:

• The application of the $\rightarrow_\sqcap$-rule is triggered by an assertion $x : C_1 \sqcap C_2 \in \mathcal{A}$.
Since $x^{\mathcal{I}} \in (C_1 \sqcap C_2)^{\mathcal{I}}$, also $x^{\mathcal{I}} \in C_1^{\mathcal{I}} \cap C_2^{\mathcal{I}}$. Hence, $\mathcal{I}$ is also a model for
$\mathcal{A}' = \mathcal{A} \cup \{x : C_1, x : C_2\}$.

• The $\rightarrow_\exists$-rule is applied due to an assertion $x : \exists R.D \in \mathcal{A}$. Since $\mathcal{I}$ is a model
of $\mathcal{A}$, there exists an $a \in \Delta^{\mathcal{I}}$ with $(x^{\mathcal{I}}, a) \in R^{\mathcal{I}}$ and $a \in D^{\mathcal{I}}$. Hence, the
interpretation $\mathcal{I}[y \mapsto a]$, which maps $y$ to $a$ and behaves like $\mathcal{I}$ on all other
names, is a model of $\mathcal{A}' = \mathcal{A} \cup \{(x, y) : R, y : D\}$. Note that this requires $y$ to
be fresh.

• The $\rightarrow_\forall$-rule is applied due to assertions $\{x : \forall R.D, (x, y) : R\} \subseteq \mathcal{A}$. Since
$\mathcal{I} \models \mathcal{A}$, $y^{\mathcal{I}} \in D^{\mathcal{I}}$ must hold. Hence, $\mathcal{I}$ is also a model of $\mathcal{A}' = \mathcal{A} \cup \{y : D\}$.

3. Again, let $\mathcal{I}$ be a model of $\mathcal{A}$. If an assertion $x : C_1 \sqcup C_2$ triggers the application of
the $\rightarrow_\sqcup$-rule, then $x^{\mathcal{I}} \in (C_1 \sqcup C_2)^{\mathcal{I}}$ must hold. Hence, at least for one of the possible
choices for $D \in \{C_1, C_2\}$, $x^{\mathcal{I}} \in D^{\mathcal{I}}$ holds. For this choice, adding $x : D$ to $\mathcal{A}$ leads to
an ABox that is satisfied by $\mathcal{I}$. ■

Theorem 3.10 (Correctness of the ALC-algorithm)

The ALC-algorithm is a non-deterministic decision procedure for satisfiability of ALC-
concepts.

Proof. Termination was shown in Corollary 3.5. In Lemma 3.8 and Lemma 3.9, we have
established the conditions required to apply Theorem 3.6, which yields correctness of the
ALC-algorithm. ■



3.2.2 Complexity 

Now that we know that the ALC-algorithm is a non-deterministic decision procedure for
satisfiability of ALC-concepts, we want to analyse the computational complexity of the
algorithm to make sure that it matches the known worst-case complexity of the problem.

Some Basics from Complexity Theory

First, we briefly introduce the notions from complexity theory that we will encounter in 
this thesis. For a thorough introduction to complexity theory we refer to (Papadimitriou, 
1994). 

Let $M$ be a Turing Machine (TM) with input alphabet $\Sigma$. For a function $f : \mathbb{N} \to \mathbb{N}$,
we say that $M$ operates within time $f(n)$ if, for any input string $x \in \Sigma^*$, $M$ terminates on
input $x$ after at most $f(|x|)$ steps, where $|x|$ denotes the length of $x$. $M$ operates within
space $f(n)$ if, for any input $x \in \Sigma^*$, $M$ requires space at most $f(|x|)$. For an arbitrary
function $f(n)$ we define the following classes of languages:

TIME($f(n)$) = $\{L \subseteq \Sigma^* \mid L$ is decided by a deterministic TM that operates within time $f(n)\}$,
NTIME($f(n)$) = $\{L \subseteq \Sigma^* \mid L$ is decided by a non-deterministic TM that operates within time $f(n)\}$,
SPACE($f(n)$) = $\{L \subseteq \Sigma^* \mid L$ is decided by a deterministic TM that operates within space $f(n)\}$,
NSPACE($f(n)$) = $\{L \subseteq \Sigma^* \mid L$ is decided by a non-deterministic TM that operates within space $f(n)\}$.

Since every deterministic TM is a non-deterministic TM, TIME($f(n)$) $\subseteq$ NTIME($f(n)$)
and SPACE($f(n)$) $\subseteq$ NSPACE($f(n)$) hold trivially for an arbitrary function $f$. Also,
TIME($f(n)$) $\subseteq$ SPACE($f(n)$) and NTIME($f(n)$) $\subseteq$ NSPACE($f(n)$) hold trivially for an
arbitrary $f$ because within time $f(n)$ a TM can consume at most $f(n)$ units of space.
In this thesis, we will encounter the complexity classes shown in Figure 3.2.

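Figure 3.2 Some complexity classes

PSpace = $\bigcup_{k \in \mathbb{N}}$ SPACE($n^k$)
NPSpace = $\bigcup_{k \in \mathbb{N}}$ NSPACE($n^k$)
ExpTime = $\bigcup_{k \in \mathbb{N}}$ TIME($2^{n^k}$)
NExpTime = $\bigcup_{k \in \mathbb{N}}$ NTIME($2^{n^k}$)
2-ExpTime = $\bigcup_{k \in \mathbb{N}}$ TIME($2^{2^{n^k}}$)
2-NExpTime = $\bigcup_{k \in \mathbb{N}}$ NTIME($2^{2^{n^k}}$)
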

It is known that the following relationships hold for these classes:

PSpace = NPSpace $\subseteq$ ExpTime $\subseteq$ NExpTime $\subseteq$ 2-ExpTime $\subseteq$ 2-NExpTime,

where the fact that PSpace = NPSpace is a corollary of Savitch's theorem (Savitch,
1970).

We employ the usual notion of polynomial many-to-one reductions and completeness:
let $L_1, L_2 \subseteq \Sigma^*$ be two languages. A function $r : \Sigma^* \to \Sigma^*$ is a polynomial reduction from
$L_1$ to $L_2$ iff there exists a $k \in \mathbb{N}$ such that $r(x)$ can be computed within time $O(|x|^k)$ and
$x \in L_1$ iff $r(x) \in L_2$. A language $L$ is hard for a complexity class $\mathcal{C}$ if, for any $L' \in \mathcal{C}$,
there exists a polynomial reduction from $L'$ to $L$. The language $L$ is complete for $\mathcal{C}$ if it is
$\mathcal{C}$-hard and $L \in \mathcal{C}$.

All these definitions are dependent on the arbitrary but fixed finite input alphabet $\Sigma$.
The choice of this alphabet is inessential as long as it contains at least two symbols. This
allows for succinct encoding of arbitrary problems and a larger input alphabet can reduce
the size of the encoding of a problem only by a polynomial amount. All defined complexity
classes are insensitive to these changes. From now on, we assume an arbitrary but fixed
finite input alphabet $\Sigma$ with at least two symbols.

Note that this implies that there is not necessarily a distinct symbol for every concept,
role, or individual name in $\Sigma$. Instead, we assume that the names appearing in concepts are
suitably numbered. The results we are going to present are insensitive to this (logarithmic)
overhead and so we ignore this issue from now on.

Definition 3.11

For an arbitrary syntactic entity $X$, like a concept, TBox assertion, knowledge base, etc.,
we denote the length of a suitable encoding of $X$ in the alphabet $\Sigma$ with $|X|$. ◇

The Complexity of ALC-Satisfiability

Fact 3.12 (Schmidt-Schauß & Smolka, 1991, Theorem 6.3)

Satisfiability of ALC-concepts is PSPACE-complete.

Since we are aiming for a PSPACE-algorithm, we do not have to deal explicitly with the
non-determinism because PSpace = NPSpace. Yet, if naively executed, the algorithm
behaves worse because it generates a model for a satisfiable concept and there are ALC-
concepts that are only satisfiable in exponentially large interpretations, i.e., it is possible to
give a concept $C_n$ of size polynomial in $n$ such that any model of $C_n$ essentially contains
a full binary tree of depth $n$ and hence at least $2^n - 1$ nodes (Halpern & Moses, 1992).
Since the tableau generates a full description of a model, a naive implementation would
require exponential space.

To obtain an algorithm with optimal worst-case complexity, the ALC-algorithm has
to be implemented in a certain fashion using the so-called trace technique. The key idea
behind this technique is that instead of keeping the full ABox $\mathcal{A}$ in memory simultaneously,
it is sufficient to consider only a single path in $G_\mathcal{A}$ at one time. In Lemma 3.4 we have seen
that the length of such a path is linearly bounded in the size of the input concept and there
are only linearly many constraints for every node on such a path. Hence, if it is possible
to explore one path at a time, then polynomial storage suffices. This can be achieved
by a depth-first expansion of the ABox that selects the rule to apply in a given situation
according to a specific strategy (immediately stopping with the output "unsatisfiable" if a
clash is generated).

Figure 3.3 A non-deterministic PSpace decision procedure for ALC.

    ALC-SAT($C$) := sat($x_0$, $\{x_0 : C\}$)

    sat($x$, $\mathcal{A}$):
      while (the $\to_\sqcap$- or the $\to_\sqcup$-rule can be applied) and ($\mathcal{A}$ is clash-free) do
        apply the $\to_\sqcap$- or the $\to_\sqcup$-rule to $\mathcal{A}$
      od
      if $\mathcal{A}$ contains a clash then return "not satisfiable"
      $E := \{x : \exists R.D \mid x : \exists R.D \in \mathcal{A}\}$
      while $E \neq \emptyset$ do
        pick an arbitrary $x : \exists R.D \in E$
        $\mathcal{A}_{new} := \{(x, y) : R, y : D\}$ where $y$ is a fresh individual
        while (the $\to_\forall$-rule can be applied to $\mathcal{A} \cup \mathcal{A}_{new}$) do
          apply the $\to_\forall$-rule and add the new constraints to $\mathcal{A}_{new}$
        od
        if $\mathcal{A} \cup \mathcal{A}_{new}$ contains a clash then return "not satisfiable"
        if sat($y$, $\mathcal{A} \cup \mathcal{A}_{new}$) = "not satisfiable" then return "not satisfiable"
        $E := E \setminus \{x : \exists R.D \mid y : D \in \mathcal{A}_{new}\}$
        discard $\mathcal{A}_{new}$ from memory
      od
      return "satisfiable"



Lemma 3.13

The ALC-algorithm can be implemented in PSpace.

Proof. Let $C$ be the ALC-concept to be tested for satisfiability. We can assume $C$ to
be in NNF because transformation into NNF can be performed in linear time. Figure 3.3
sketches an implementation of the ALC-algorithm that uses the trace technique to save
memory and runs in polynomial space.

The algorithm generates the constraint system in a depth-first manner: before generat-
ing any successors for an individual $x$, the $\to_\sqcap$- and $\to_\sqcup$-rules are applied exhaustively. Then
successors are considered for every existential restriction in $\mathcal{A}$ one after another, re-using
space. This has the consequence that a clash involving an individual $x$ must be present
in $\mathcal{A}$ by the time generation of successors for $x$ is initiated or will never occur. This also
implies that it is safe to delete parts of the constraint system for a successor $y$ as soon as
the existence of a complete and clash-free "sub" constraint system has been determined.
Of course, it then has to be ensured that we do not consider the same existential restric-
tion $x : \exists R.D$ more than once because this might lead to non-termination. Here, we do
this using the set $E$ that records which constraints still have to be considered. Hence, the
algorithm is indeed an implementation of the ALC-algorithm.

Space analysis of the algorithm is simple: since $\mathcal{A}_{new}$ is reset for every successor that is
generated, this algorithm stores only a single path at any given time, which, by Lemma 3.4,
can be done using polynomial space only. ■
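
The procedure of Figure 3.3 can be sketched in Python as follows. This is only an illustration under assumptions that are not part of the text: concepts are encoded as nested tuples in NNF, the function name `sat` and the tags `"some"` and `"all"` are hypothetical, the non-deterministic choices of the $\to_\sqcup$-rule are replaced by deterministic backtracking, and an individual is represented only by the set of concepts asserted for it (which is enough here because the constraint system is tree-shaped).

```python
# Hypothetical tuple encoding of ALC concepts in NNF:
#   ("atom", "A"), ("not", ("atom", "A")), ("and", C, D), ("or", C, D),
#   ("some", "R", C) for an existential and ("all", "R", C) for a universal restriction.

def sat(label):
    """Return True iff the conjunction of the NNF concepts in `label` is satisfiable."""
    label = set(label)

    # Conjunction rule, applied exhaustively to the current individual.
    todo = list(label)
    while todo:
        c = todo.pop()
        if c[0] == "and":
            for d in c[1:]:
                if d not in label:
                    label.add(d)
                    todo.append(d)

    # Disjunction rule: backtracking stands in for the non-deterministic choice.
    for c in label:
        if c[0] == "or" and c[1] not in label and c[2] not in label:
            return sat(label | {c[1]}) or sat(label | {c[2]})

    # Clash check: a concept name asserted together with its negation.
    if any(c[0] == "atom" and ("not", c) in label for c in label):
        return False

    # Trace technique: build one successor at a time, push the universal
    # restrictions into it, recurse, and discard it before building the next one.
    for c in label:
        if c[0] == "some":
            successor = {c[2]} | {d[2] for d in label
                                  if d[0] == "all" and d[1] == c[1]}
            if not sat(successor):
                return False
    return True


A, B = ("atom", "A"), ("atom", "B")
print(sat({("some", "R", A), ("all", "R", ("not", A))}))   # unsatisfiable
print(sat({("some", "R", A), ("some", "R", ("not", A))}))  # satisfiable
```

At any moment only the labels along the current recursion path are alive, which mirrors the "single path" argument of the proof; the backtracking over disjunctions costs time but no additional asymptotic space.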

As a corollary, we get an exact classification of the complexity of satisfiability of ALC-
concepts.

Theorem 3.14

Satisfiability of ALC-concepts is PSPACE-complete.

Proof. Satisfiability of ALC-concepts is known to be PSPACE-hard (Schmidt-Schauß
& Smolka, 1991), which is shown by reduction from the well-known PSPACE-complete
problem QBF (Stockmeyer & Meyer, 1973). Lemma 3.13 together with the fact that
PSpace = NPSpace (Savitch's theorem (1970)) yields the corresponding upper complex-
ity bound. ■

It is possible to give an even tighter bound for the complexity of ALC-concept satisfiabil-
ity and to show that the problem is solvable in deterministic linear space. This was already
claimed in (Schmidt-Schauß & Smolka, 1991), but a closer inspection of that algorithm by
Hemaspaandra reveals that it consumes memory in the order of $O(n \log n)$ for a concept
with length $|C| = n$. Hemaspaandra (2000) gives an algorithm that decides satisfiability
for the modal logic K in deterministic linear space and which is easily applicable to ALC.

3.2.3 Other Inference Problems for ALC

Concept satisfiability is only one inference that is of interest for DL systems. In the
remainder of this chapter we give a brief overview of solutions for the other standard
inferences for ALC.

Reasoning with ABoxes

To decide ABox satisfiability of an ALC-ABox $\mathcal{A}$ (w.r.t. an empty TBox), one can simply
apply the ALC-algorithm starting with $\mathcal{A}$ as the initial ABox. One can easily see that the
proofs of soundness and completeness uniformly apply also to this case. Yet, since the
generated constraint system is no longer tree-shaped, termination and complexity have
to be reconsidered. Hollunder (1996) describes pre-completion, a technique that allows
reduction of ABox satisfiability directly to ALC-concept satisfiability. The general idea is
as follows: all non-generating rules are applied to the input ABox $\mathcal{A}$ exhaustively, yielding
a pre-completion $\mathcal{A}'$ of $\mathcal{A}$. After that, the ALC-algorithm is called for every individual $x$ of
$\mathcal{A}'$ to decide satisfiability of the concept

$$C_x := \bigsqcap_{x : D \in \mathcal{A}'} D.$$

It can be shown that $\mathcal{A}'$ is satisfiable iff $C_x$ is satisfiable for every individual $x$ in $\mathcal{A}'$ and that
$\mathcal{A}$ is satisfiable iff the non-generating rules can be applied in a way that yields a satisfiable
pre-completion. Since ABox satisfiability is at least as hard as concept satisfiability, we
get:

Corollary 3.15

Consistency of ALC-ABoxes w.r.t. an empty TBox is PSPACE-complete.

Reasoning with Simple TBoxes

For a simple TBox $\mathcal{T}$, concept satisfiability w.r.t. $\mathcal{T}$ can be reduced to concept satisfiability
by a process called unfolding:

Let $C$ be an ALC-concept and $\mathcal{T}$ a simple TBox. The unfolding $C_\mathcal{T}$ of $C$ w.r.t. $\mathcal{T}$ is
obtained by successively replacing every defined name in $C$ by its definition from $\mathcal{T}$ until
only primitive (i.e., undefined) names occur. It can easily be shown that $C$ is satisfiable
w.r.t. $\mathcal{T}$ iff $C_\mathcal{T}$ is satisfiable. Unfortunately, this does not yield a PSPACE-algorithm, as the
size of $C_\mathcal{T}$ may be exponential in the size of $C$ and $\mathcal{T}$. Lutz (1999) describes a technique
called lazy unfolding that performs the unfolding of $C$ w.r.t. $\mathcal{T}$ on demand, which yields:

Fact 3.16 (Lutz, 1999, Theorem 1)

Satisfiability of ALC-concepts w.r.t. a simple TBox is PSPACE-complete.
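
The exponential blow-up caused by eager unfolding can be observed on a small family of simple TBoxes. The following sketch is illustrative only: the tuple encoding of concepts, the helper names `unfold`, `count_names` and `example_tbox`, and the particular TBox (each $A_i$ defined as $\forall R.A_{i+1} \sqcap \forall S.A_{i+1}$) are assumptions chosen for the example and do not come from the text.

```python
# Hypothetical tuple encoding: ("atom", name), ("not", C), ("and", C, D),
# ("or", C, D), ("some", role, C), ("all", role, C).
# A simple TBox is modelled as a dict mapping each defined name to its definition.

def unfold(concept, tbox):
    """Replace defined names by their definitions until only primitive names remain."""
    tag = concept[0]
    if tag == "atom":
        name = concept[1]
        return unfold(tbox[name], tbox) if name in tbox else concept
    if tag == "not":
        return ("not", unfold(concept[1], tbox))
    if tag in ("and", "or"):
        return (tag, unfold(concept[1], tbox), unfold(concept[2], tbox))
    if tag in ("some", "all"):
        return (tag, concept[1], unfold(concept[2], tbox))
    raise ValueError(f"unknown concept: {concept!r}")

def count_names(concept):
    """Number of occurrences of concept names, used here as a crude size measure."""
    tag = concept[0]
    if tag == "atom":
        return 1
    if tag in ("and", "or"):
        return count_names(concept[1]) + count_names(concept[2])
    return count_names(concept[-1])  # "not", "some", "all": recurse into the sub-concept

def example_tbox(n):
    """A_i is defined as (all R.A_{i+1}) and (all S.A_{i+1}) for 0 <= i < n; A_n is primitive."""
    return {
        f"A{i}": ("and",
                  ("all", "R", ("atom", f"A{i + 1}")),
                  ("all", "S", ("atom", f"A{i + 1}")))
        for i in range(n)
    }

for n in (1, 5, 10):
    print(n, count_names(unfold(("atom", "A0"), example_tbox(n))))
```

The TBox grows linearly in `n`, whereas the number of name occurrences in the unfolded concept doubles with every level; lazy unfolding avoids materialising this concept.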

Finally, the techniques of pre-completion and lazy unfolding can be combined, which
yields:

Corollary 3.17

Consistency of ALC knowledge bases with a simple TBox is PSPACE-complete.

Reasoning with General TBoxes

If general TBoxes are considered instead of simple ones, the complexity of the inference
problems rises.

Theorem 3.18

Satisfiability of ALC-concepts (and hence of ABoxes) w.r.t. general TBoxes is ExpTime-
hard.

Proof. As mentioned before, ALC is a syntactic variant of the propositional modal logic
$\mathsf{K}_m$ (Schild, 1991). As a simple consequence of the proof of ExpTime-completeness of
K with a universal modality (Spaan, 1993a) (i.e., in DL terms, a role linking every two
individuals), one obtains that the global satisfaction problem for K is an ExpTime-complete
problem. The global satisfaction problem is defined as follows:

Given a K-formula $\phi$, is there a Kripke model $\mathfrak{M}$ such that $\phi$ holds at every
world in $\mathfrak{M}$?

Using the correspondence between ALC and $\mathsf{K}_m$, this can be re-stated as an ExpTime-
complete problem for ALC:

Given an ALC-concept $C$, is there an interpretation $\mathcal{I}$ such that $C^\mathcal{I} = \Delta^\mathcal{I}$?

Obviously, this holds iff the tautological concept $\top$ is satisfiable w.r.t. the (non-simple)
TBox $\mathcal{T} = \{\top \doteq C\}$, which implies that satisfiability of ALC-concepts (and hence of
ABoxes) w.r.t. general TBoxes is ExpTime-hard. ■

A matching upper bound for ALC is given by De Giacomo and Lenzerini (1996) by a
reduction to PDL, which yields:

Corollary 3.19

Satisfiability and subsumption w.r.t. general TBoxes, knowledge base satisfiability and
instance checking for ALC are ExpTime-complete problems.

Chapter 4

Qualifying Number Restrictions



In this chapter we study the complexity of reasoning with ALCQ, the extension of ALC
with qualifying number restrictions. While for ALC, or, more precisely, for its syntactic
variant K, PSPACE-completeness has already been established quite some time ago by
Ladner (1977), the situation is entirely different for ALCQ or its corresponding (multi-)
modal logic $\mathbf{Gr}(\mathbf{K}_\mathcal{R})$. For ALCQ, decidability of concept satisfiability has been shown only
rather recently by Baader and Hollunder (1991) and the known PSPACE upper complexity
bound for ALCQ is only valid if we assume unary coding of numbers in the input, which is an
unnatural restriction. For binary coding no upper bound was known and the problem had
been conjectured to be ExpTime-hard by van der Hoek and de Rijke (1995). This coincides
with the observation that a straightforward adaptation of the translation technique leads to
an exponential blow-up in the size of the first-order formula. This is because it is possible
to store the number $n$ in $\log_k n$ bits if numbers are represented in $k$-ary coding.

We show that reasoning for ALCQ is not harder than reasoning for ALC (w.r.t. worst-
case complexity) by presenting an algorithm that decides satisfiability in PSpace, even if
the numbers in the input are binary coded. It is based on the tableau algorithm for ALC
and tries to prove the satisfiability of a given concept by explicitly constructing a model
for it. When trying to generalise the tableau algorithms for ALC to deal with ALCQ, there
are some difficulties: (1) the straightforward approach leads to an incorrect algorithm; (2)
even if this pitfall is avoided, special care has to be taken in order to obtain a space-efficient
solution. As an example for (1), we will show that the algorithm presented in (van der
Hoek & de Rijke, 1995) to decide satisfiability of $\mathbf{Gr}(\mathbf{K}_\mathcal{R})$, a syntactic variant of ALCQ,
is incorrect. Nevertheless, this algorithm will be the basis of our further considerations.
Problem (2) is due to the fact that tableau algorithms try to prove the satisfiability of a
concept by explicitly building a model for it. If the tested formula requires the existence
of $n$ accessible role successors, a tableau algorithm will include them in the constructed
model, which leads to exponential space consumption, at least if the numbers in the input
are not unarily coded or memory is not re-used. An example for a correct algorithm
which suffers from this problem can be found in (Hollunder & Baader, 1991) and is briefly
presented in this thesis. As we will see, the trace technique alone is not sufficient to obtain
an algorithm that runs in polynomial space. Our algorithm overcomes this additional
problem by organising the search for a model in a way that allows for the re-use of space
for each successor, thus being capable of deciding satisfiability of ALCQ in PSpace.

Using an extension of these techniques we obtain a PSpace algorithm for the logic
ALCQIb, which extends ALCQ by expressive role expressions. This solves an open problem
from (Donini et al., 1997).

Finally we study the complexity of reasoning w.r.t. general knowledge bases for ALCQIb
and establish ExpTime-completeness. This extends the ExpTime-completeness result for
the more "standard" DL ALCQI (De Giacomo, 1995). Moreover, the proof in (De Giacomo,
1995) is only valid in case of unary coding of numbers in the input whereas our proof also
applies in the case of binary coding.

4.1 Syntax and Semantics of ALCQ

The DL ALCQ is obtained from ALC by adding so-called qualifying number restrictions,
i.e., concepts restricting the number of individuals that are related via a given role instead
of allowing for existential or universal restrictions only like in ALC. ALCQ is a syntactic
variant of the graded propositional modal logic $\mathbf{Gr}(\mathbf{K}_\mathcal{R})$.

Definition 4.1 (Syntax of ALCQ)

Let NC be a set of atomic concept names and NR be a set of atomic role names. The set of
ALCQ-concepts is built inductively from these according to the following grammar, where
$A \in$ NC, $R \in$ NR, and $n \in \mathbb{N}$:

$$C ::= A \mid \neg C \mid C_1 \sqcap C_2 \mid C_1 \sqcup C_2 \mid \forall R.C \mid \exists R.C \mid (\leq n\, R\, C) \mid (\geq n\, R\, C).$$

◇

Thus, the set of ALCQ-concepts is defined similarly to the set of ALC-concepts, with the
additional rule that, if $R \in$ NR, $C$ is an ALCQ-concept, and $n \in \mathbb{N}$, then also $(\leq n\, R\, C)$
and $(\geq n\, R\, C)$ are ALCQ-concepts. To define the semantics of ALCQ-concepts, we extend
Definition 2.2 to deal with these additional concept constructors:

Definition 4.2 (Semantics of ALCQ)

For an interpretation $\mathcal{I} = (\Delta^\mathcal{I}, \cdot^\mathcal{I})$, the semantics of ALCQ-concepts is defined inductively
as for ALC-concepts with the additional rules:

$$(\leq n\, R\, C)^\mathcal{I} = \{x \in \Delta^\mathcal{I} \mid \sharp R^\mathcal{I}(x, C) \leq n\} \quad\text{and}$$
$$(\geq n\, R\, C)^\mathcal{I} = \{x \in \Delta^\mathcal{I} \mid \sharp R^\mathcal{I}(x, C) \geq n\},$$

where $R^\mathcal{I}(x, C) = \{y \mid (x, y) \in R^\mathcal{I} \text{ and } y \in C^\mathcal{I}\}$ and $\sharp$ denotes set cardinality. ◇

From these semantics, it is immediately clear that we can dispose of existential and
universal restrictions in the syntax without changing the expressiveness of ALCQ, since
the following equivalences allow the elimination of universal and existential restrictions in
linear time:

$$\exists R.C \equiv (\geq 1\, R\, C) \qquad \forall R.C \equiv (\leq 0\, R\, \neg C)$$

In the following, we assume that ALCQ-concepts are built without existential or univer-
sal restrictions. To obtain the NNF of an ALCQ-concept, one can "apply" the following
equivalences (in addition to de Morgan's laws):

$$\neg(\leq n\, R\, C) \equiv (\geq (n+1)\, R\, C) \qquad
\neg(\geq n\, R\, C) \equiv \begin{cases} \bot & \text{if } n = 0, \\ (\leq (n-1)\, R\, C) & \text{otherwise.} \end{cases}$$

Like for ALC, one can obtain the NNF of an ALCQ-concept in linear time. For an ALCQ-
concept $C$, we denote the NNF of $\neg C$ by $\sim C$.

4.2 Counting Pitfalls

Before we present our algorithm for deciding satisfiability of ALCQ, for historic and didactic
reasons, we present two other solutions: an incorrect one (van der Hoek & de Rijke, 1995),
and a solution that is less efficient (Hollunder & Baader, 1991).

4.2.1 An Incorrect Solution

Van der Hoek and de Rijke (1995) give an algorithm for deciding satisfiability of the graded
modal logic $\mathbf{Gr}(\mathbf{K}_\mathcal{R})$. Since $\mathbf{Gr}(\mathbf{K}_\mathcal{R})$ is a notational variant of ALCQ, such an algorithm
could also be used to decide concept satisfiability for ALCQ. Unfortunately, the given
algorithm is incorrect. Nevertheless, it will be the basis for our further considerations and
thus it is presented here. It will be referred to as the incorrect algorithm. It is based
on a tableau algorithm given in (Donini et al., 1997) to decide the satisfiability of the
DL ALCN, but overlooks an important pitfall that distinguishes reasoning for qualifying
number restrictions from reasoning with number restrictions. This mistake leads to the
incorrectness of the algorithm. To fit our presentation, we use DL syntax in the presentation
of the algorithm. Refer to (Tobies, 1999b) for a presentation in modal syntax.

Similar to the ALC-algorithm presented in Section 2.1, the flawed solution is a tableau
algorithm that tries to build a model for a concept $C$ by manipulating sets of constraints
with certain completion rules. Again, ABoxes are used to capture constraint systems.

Algorithm 4.3 (Incorrect Algorithm for ALCQ, van der Hoek & de Rijke, 1995)

For an ABox $\mathcal{A}$, a role name $R$, an individual $x$, and a concept $D$, let $\sharp R^\mathcal{A}(x, D)$ be the
number of individuals $y$ for which $\{(x, y) : R, y : D\} \subseteq \mathcal{A}$. The ABox $[z/y]\mathcal{A}$ is obtained
from $\mathcal{A}$ by replacing every occurrence of $y$ by $z$; this replacement is said to be safe iff, for
every individual $x$, concept $D$, and role name $R$ with $\{x : (\geq n\, R\, D), (x, y) : R, (x, z) : R\} \subseteq \mathcal{A}$,
we have $\sharp R^{[z/y]\mathcal{A}}(x, D) \geq n$.

The definition of a clash is slightly extended from the ALC-case to deal with obviously
contradictory number restrictions: an ABox $\mathcal{A}$ is said to contain a clash iff

$$\{x : A, x : \neg A\} \subseteq \mathcal{A} \quad\text{or}\quad \{x : (\leq m\, R\, D), x : (\geq n\, R\, D)\} \subseteq \mathcal{A}$$

for a concept name $A$, a concept $D$, and two integers $m < n$. Otherwise, $\mathcal{A}$ is called clash-
free. An ABox $\mathcal{A}$ is called complete iff none of the rules given in Fig. 4.1 is applicable to
$\mathcal{A}$.

To test the satisfiability of a concept $C$, the incorrect algorithm works as follows: it
starts with the initial ABox $\{x : C\}$ and successively applies the rules given in Fig. 4.1,
stopping when a clash occurs. Both the rule to apply and the concept to add (in the
$\to_\sqcup$-rule) or the individuals to identify (in the $\to_\leq$-rule) are selected non-deterministically.
The algorithm answers "$C$ is satisfiable" iff the rules can be applied in a way that yields a
complete and clash-free ABox. ◇

The notion of safe replacement of variables is needed to ensure the termination of the 
rule application (see Hollunder & Baader, 1991). The same purpose could be achieved by 
explicitly asserting all successors generated to satisfy an at-least restriction to be unequal 
and preventing the identification of unequal elements. Yet, since this notion of safe re- 
placement recurs in the algorithm of Baader and Hollunder (1991), which we are going to 
describe later on, and since we want to outline an error in the incorrect algorithm, we stay 
as close to the original description as possible. 

Since we are interested in PSpace algorithms, as for ALC, non-determinism poses no
problem due to Savitch's Theorem, which implies that deterministic and non-deterministic
polynomial space coincide (Savitch, 1970).

As described in Section 2.1, to prove the correctness of such a tableau algorithm, we 
need to show three properties of the completion: 

1. Termination: Any sequence of rule applications is finite. 

2. Soundness: If the algorithm terminates with a complete and clash-free ABox A, then 
the tested concept is satisfiable. 

3. Completeness: If the concept is satisfiable, then there is a sequence of rule applica- 
tions that yields a complete and clash-free ABox. 

The error of the incorrect algorithm is that it does not satisfy Property 2, even though
the opposite is claimed:

Claim (van der Hoek & de Rijke, 1995): (Restated in DL terminology) Let $C$
be an ALCQ-concept in NNF. $C$ is satisfiable iff $\{x : C\}$ can be transformed
into a clash-free complete ABox using the rules from Figure 4.1.






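Figure 4.1 The incorrect completion rules for ALCQ.

$\to_\sqcap$: if 1. $x : C_1 \sqcap C_2 \in \mathcal{A}$ and
       2. $\{x : C_1, x : C_2\} \not\subseteq \mathcal{A}$
    then $\mathcal{A} \to_\sqcap \mathcal{A} \cup \{x : C_1, x : C_2\}$

$\to_\sqcup$: if 1. $x : C_1 \sqcup C_2 \in \mathcal{A}$ and
       2. $\{x : C_1, x : C_2\} \cap \mathcal{A} = \emptyset$
    then $\mathcal{A} \to_\sqcup \mathcal{A} \cup \{x : D\}$ for some $D \in \{C_1, C_2\}$

$\to_\geq$: if 1. $x : (\geq n\, R\, D) \in \mathcal{A}$ and
       2. $\sharp R^\mathcal{A}(x, D) < n$
    then $\mathcal{A} \to_\geq \mathcal{A} \cup \{(x, y) : R, y : D\}$ where $y$ is a fresh variable.

$\to_{\leq 0}$: if 1. $\{x : (\leq 0\, R\, D), (x, y) : R\} \subseteq \mathcal{A}$ and
       2. $y : \sim D \notin \mathcal{A}$
    then $\mathcal{A} \to_{\leq 0} \mathcal{A} \cup \{y : \sim D\}$

$\to_\leq$: if 1. $x : (\leq n\, R\, D) \in \mathcal{A}$, $\sharp R^\mathcal{A}(x, D) > n > 0$, and
       2. $\{(x, y) : R, (x, z) : R\} \subseteq \mathcal{A}$ for some $y \neq z$ (a) and
       3. replacing $y$ by $z$ is safe in $\mathcal{A}$
    then $\mathcal{A} \to_\leq [z/y]\mathcal{A}$

(a) The rules in (van der Hoek & de Rijke, 1995) do not require $\{y : D, z : D\} \subseteq \mathcal{A}$, as one might expect.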



Unfortunately, the if-direction of this claim is not true. The problem lies in the fact
that, while a clash causes unsatisfiability, a complete and clash-free ABox is not necessarily
satisfiable. The following counterexample exhibits this problem. Consider the concept

$$C = (\geq 3\, R\, A) \sqcap (\leq 1\, R\, B) \sqcap (\leq 1\, R\, \neg B).$$

On the one hand, $C$ is clearly not satisfiable. Assume an interpretation $\mathcal{I}$ with $x \in C^\mathcal{I}$.
This implies the existence of at least three $R$-successors $y_1, y_2, y_3$ of $x$. For each of the $y_i$
either $y_i \in B^\mathcal{I}$ or $y_i \in (\neg B)^\mathcal{I}$ holds by the definition of $\cdot^\mathcal{I}$. Without loss of generality, there
are two elements $y_{i_1}, y_{i_2}$ such that $\{y_{i_1}, y_{i_2}\} \subseteq B^\mathcal{I}$, which implies $x \notin (\leq 1\, R\, B)^\mathcal{I}$ and hence
$x \notin C^\mathcal{I}$.

On the other hand, the ABox $\mathcal{A} = \{x_0 : C\}$ can be turned into a complete and clash-
free ABox using the rules from Fig. 4.1, as is shown in Fig. 4.2. Clearly this invalidates
the claim and thus its proof.

To understand the mistake of the incorrect algorithm, it is useful to recall how soundness
is usually established for tableau algorithms. The central idea is that a complete and
clash-free ABox $\mathcal{A}$ is "obviously" satisfiable, in the sense that a model of $\mathcal{A}$ can directly be
constructed from $\mathcal{A}$. For a complete and clash-free ALCQ-ABox $\mathcal{A}$ we define the canonical
interpretation as in Definition 3.7.

The mistake of the incorrect algorithm is due to the fact that it did not take into
account that, in the canonical interpretation induced by a complete and clash-free ABox,
there are concepts satisfied by the individuals even though these concepts do not appear as
constraints in the ABox. In our example, all of the $y_i$, for which $B$ is not explicitly asserted,
satisfy $\neg B$ in the canonical interpretation but this is not reflected in the generated ABox.

Figure 4.2 A run of the incorrect algorithm.

$\{x : C\} \to_\sqcap \cdots \to_\sqcap \mathcal{A}_1 = \{x : C, x : (\geq 3\, R\, A), x : (\leq 1\, R\, B), x : (\leq 1\, R\, \neg B)\}$
$\to_\geq \cdots \to_\geq \mathcal{A}_2 = \mathcal{A}_1 \cup \{(x, y_i) : R, y_i : A \mid i = 1, 2, 3\}$

$\mathcal{A}_2$ is clash-free and complete, because $\sharp R^{\mathcal{A}_2}(x, A) = 3$ and $\sharp R^{\mathcal{A}_2}(x, B) = 0$.
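
The effect can be checked mechanically by evaluating $C$ in the canonical interpretation of the complete and clash-free ABox from the run of the incorrect algorithm. The sketch below is an illustration only: the tuple encoding of concepts and the names `holds`, `atoms` and `edges` are assumptions introduced for this example.

```python
# Hypothetical tuple encoding: ("atom", "A"), ("not", C), ("and", C, D), ("or", C, D),
# ("ge", n, "R", C) for (>= n R C) and ("le", n, "R", C) for (<= n R C).

def holds(c, x, atoms, edges):
    """Evaluate concept c at individual x in a finite interpretation.

    atoms: set of (individual, concept_name) pairs; edges: set of (x, role, y) triples.
    """
    tag = c[0]
    if tag == "atom":
        return (x, c[1]) in atoms
    if tag == "not":
        return not holds(c[1], x, atoms, edges)
    if tag == "and":
        return holds(c[1], x, atoms, edges) and holds(c[2], x, atoms, edges)
    if tag == "or":
        return holds(c[1], x, atoms, edges) or holds(c[2], x, atoms, edges)
    if tag in ("ge", "le"):
        n, role, d = c[1], c[2], c[3]
        count = sum(1 for (u, r, v) in edges
                    if u == x and r == role and holds(d, v, atoms, edges))
        return count >= n if tag == "ge" else count <= n
    raise ValueError(f"unknown concept: {c!r}")

A, B = ("atom", "A"), ("atom", "B")
ge3A = ("ge", 3, "R", A)
le1B = ("le", 1, "R", B)
le1notB = ("le", 1, "R", ("not", B))
C = ("and", ge3A, ("and", le1B, le1notB))

# Canonical interpretation of the final ABox: x has R-successors y1, y2, y3,
# each asserted to be in A; nothing is asserted about B.
atoms = {(f"y{i}", "A") for i in (1, 2, 3)}
edges = {("x", "R", f"y{i}") for i in (1, 2, 3)}

print(holds(ge3A, "x", atoms, edges))     # at-least restriction
print(holds(le1B, "x", atoms, edges))     # at-most restriction on B
print(holds(le1notB, "x", atoms, edges))  # at-most restriction on not B
print(holds(C, "x", atoms, edges))
```

The first two restrictions hold at `x`, but all three successors satisfy $\neg B$ in the canonical interpretation, so the third restriction and hence $C$ itself fail there.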

4.2.2 A Correct but Inefficient Solution 

This problem has already been noticed in (Hollunder & Baader, 1991), where an algorithm
very similar to the incorrect one is presented that correctly decides the satisfiability of
ALCQ-concepts.

The algorithm essentially uses the same definitions and rules. The only substantial
difference is the introduction of the $\to_{choose}$-rule, which makes sure that all "relevant"
concepts that are implicitly satisfied by an individual are made explicit in the ABox. Here,
relevant concepts for an individual $y$ are those occurring in qualifying number restrictions
in constraints for variables $x$ such that $(x, y) : R$ appears in the ABox.

Algorithm 4.4 (The Standard Algorithm for ALCQ, Hollunder & Baader, 1991)

The rules of the standard algorithm are given in Figure 4.3. The definition of clash is
modified as follows: an ABox $\mathcal{A}$ contains a clash iff

• $\{x : A, x : \neg A\} \subseteq \mathcal{A}$ for some individual $x$ and a concept name $A$, or

• $x : (\leq n\, R\, D) \in \mathcal{A}$ and $\sharp R^\mathcal{A}(x, D) > n$ for some variable $x$, relation $R$, concept $D$,
and $n \in \mathbb{N}$.

The algorithm works like the incorrect algorithm with the following differences: (1) it
uses the completion rules from Fig. 4.3 (where $\bowtie$ is used as a placeholder for either $\leq$ or
$\geq$); (2) it uses the definition of clash from above; and (3) it does not immediately stop
when a clash has been generated but always generates a complete ABox. ◇
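
The modified clash definition and the effect of making the relevant concepts explicit can be illustrated on the counterexample of Section 4.2.1. The following is a minimal sketch under assumptions: ABox assertions are encoded as tuples `("concept", x, C)` and `("role", x, y, R)`, and the names `count`, `has_clash` and `base` are hypothetical. It does not implement the completion rules; it only checks ABoxes that the rules could produce.

```python
from itertools import product

# Hypothetical encoding: concepts as ("atom", "A"), ("not", C), ("ge", n, "R", C),
# ("le", n, "R", C); assertions as ("concept", x, C) and ("role", x, y, "R").

def count(abox, x, role, d):
    """Number of individuals y with both (x, y) : role and y : d asserted in the ABox."""
    return sum(1 for a in abox
               if a[0] == "role" and a[1] == x and a[3] == role
               and ("concept", a[2], d) in abox)

def has_clash(abox):
    """Clash in the sense of the standard algorithm (modified definition)."""
    for a in abox:
        if a[0] != "concept":
            continue
        _, x, c = a
        if c[0] == "atom" and ("concept", x, ("not", c)) in abox:
            return True
        if c[0] == "le" and count(abox, x, c[2], c[3]) > c[1]:
            return True
    return False

A, B = ("atom", "A"), ("atom", "B")
notB = ("not", B)

# The complete ABox produced by the incorrect algorithm for the counterexample.
base = {("concept", "x", ("ge", 3, "R", A)),
        ("concept", "x", ("le", 1, "R", B)),
        ("concept", "x", notB and ("le", 1, "R", notB))}
for i in (1, 2, 3):
    base |= {("role", "x", f"y{i}", "R"), ("concept", f"y{i}", A)}

print(has_clash(base))  # no clash is visible yet

# Making B or its negation explicit for every successor exposes a clash in all cases.
for choice in product((B, notB), repeat=3):
    extended = base | {("concept", f"y{i + 1}", d) for i, d in enumerate(choice)}
    print([d[0] for d in choice], has_clash(extended))
```

Without explicit assertions about $B$ the ABox looks clash-free; once each successor carries either $B$ or $\neg B$, one of the two at-most restrictions is exceeded in every one of the eight cases.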

The standard algorithm is a decision procedure for ALCQ-concept satisfiability:

Theorem 4.5 (Hollunder & Baader, 1991)

Let $C$ be an ALCQ-concept in NNF. $C$ is satisfiable iff $\{x : C\}$ can be transformed into a
clash-free complete ABox using the rules in Figure 4.3. Moreover, each sequence of these
rule-applications is finite.

Figure 4.3 The standard completion rules for ALCQ.

$\to_\sqcap$, $\to_\sqcup$: see Fig. 4.1

$\to_{choose}$: if 1. $\{x : (\bowtie m\, R\, D), (x, y) : R\} \subseteq \mathcal{A}$ and
       2. $\{y : D, y : \sim D\} \cap \mathcal{A} = \emptyset$
    then $\mathcal{A} \to_{choose} \mathcal{A} \cup \{y : E\}$ where $E \in \{D, \sim D\}$

$\to_\geq$: if 1. $x : (\geq n\, R\, D) \in \mathcal{A}$ and
       2. $\sharp R^\mathcal{A}(x, D) < n$
    then $\mathcal{A} \to_\geq \mathcal{A} \cup \{(x, y) : R, y : D\}$ where $y$ is a new variable.

$\to_\leq$: if 1. $x : (\leq n\, R\, D) \in \mathcal{A}$, $\sharp R^\mathcal{A}(x, D) > n$, and
       2. $\{(x, y) : R, (x, z) : R, y : D, z : D\} \subseteq \mathcal{A}$ for some $y \neq z$ and
       3. for every $u$ with $(x, u) : R \in \mathcal{A}$, $\{u : D, u : \sim D\} \cap \mathcal{A} \neq \emptyset$, and
       4. the replacement of $y$ by $z$ is safe in $\mathcal{A}$
    then $\mathcal{A} \to_\leq [z/y]\mathcal{A}$



While no complexity result is explicitly given in (Hollunder & Baader, 1991), it is easy
to see that a PSpace result could be derived from the algorithm using the trace technique
from Section 2.1.

Unfortunately this is only true if we assume the numbers in the input to be unary
coded. The reason for this lies in the $\to_\geq$-rule, which generates $n$ successors for a concept
of the form $(\geq n\, R\, D)$. If $n$ is unary coded, these successors consume at least polynomial
space in the size of the input concept. If we assume binary (or $k$-ary with $k > 1$) encoding,
the space consumption is exponential in the size of the input because a number $n$ can be
represented in $\log_k n$ bits in $k$-ary coding. This blow-up cannot be avoided because the
completeness of the standard algorithm relies on the generation and identification of these
successors, which makes it necessary to keep them in memory at one time.
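
The arithmetic behind this blow-up can be spelled out as follows. In this sketch, $\ell$ is a symbol introduced only for illustration; it stands for the number of digits needed to write $n \geq 1$ in $k$-ary coding with $k \geq 2$.

```latex
\[
  \ell = \lfloor \log_k n \rfloor + 1
  \quad\Longrightarrow\quad
  k^{\ell - 1} \;\leq\; n \;<\; k^{\ell} .
\]
```

A restriction $(\geq n\, R\, D)$ whose number occupies $\ell$ digits of the input therefore makes the $\to_\geq$-rule create at least $k^{\ell - 1}$ successors, which is exponential in $\ell$.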

4.3 An Optimal Solution 

In the following, we present the algorithm with optimal worst-case complexity,
which will be used to prove the exact complexity result for ALCQ:

Theorem 4.6

Satisfiability of ALCQ-concepts is PSPACE-complete, even if numbers in the input are
represented using binary coding.

When aiming for a PSpace algorithm, it is impossible to generate all successors of
an individual in the ABox simultaneously at a given stage as this may consume space
that is exponential in the size of the input concept. We will give an optimal rule set for
ALCQ-satisfiability that does not rely on the identification of successors. Instead we will
make stronger use of non-determinism to guess the assignment of the relevant concepts to
the successors by the time of their generation. This will make it possible to generate the
completion tree in a depth-first manner, which facilitates re-use of space.

Algorithm 4.7 (The Optimal Algorithm for ALCQ)

The definition of clash is taken from Algorithm 4.4.

To test the satisfiability of a concept $C$, the optimal algorithm starts with the initial
ABox $\{x_0 : C\}$ and successively applies the rules given in Fig. 4.4, stopping when a clash
occurs. The algorithm answers "$C$ is satisfiable" iff the rules can be applied in a way that
yields a complete and clash-free ABox. ◇

Figure 4.4 The optimal completion rules for ALCQ.

$\to_\sqcap$, $\to_\sqcup$: see Fig. 4.1

$\to_\geq$: if 1. $x : (\geq n\, R\, D) \in \mathcal{A}$, and
       2. $\sharp R^\mathcal{A}(x, D) < n$, and
       3. neither the $\to_\sqcap$- nor the $\to_\sqcup$-rule apply to a constraint for $x$
    then $\mathcal{A} \to_\geq \mathcal{A} \cup \{(x, y) : R, y : D, y : D_1, \ldots, y : D_k\}$ where
       $\{E_1, \ldots, E_k\} = \{E \mid x : (\bowtie m\, R\, E) \in \mathcal{A}\}$, $D_i \in \{E_i, \sim E_i\}$, and
       $y$ is a fresh individual.



For the different kinds of non-determinism present in this algorithm, compare the
discussion below Algorithm 3.2. In the proof of Lemma 4.14, it is shown that the choice of
which rule to apply when is don't-care non-deterministic. Any strategy that decides which
rule to apply if more than one is applicable will yield a complete algorithm.

At first glance, the $\rightarrow_\geq$-rule may appear complicated, so we explain it
in more detail: like the standard $\rightarrow_\geq$-rule, it is applicable to an ABox that contains the
constraint $x : (\geq n\ R\ D)$ if there are fewer than n R-successors y of x with $y : D \in \mathcal{A}$. The
rule then adds a new successor y of x to $\mathcal{A}$. Unlike the standard algorithm, the optimal
algorithm also adds, for each concept E appearing in a constraint of the form
$x : (\bowtie m\ R\ E)$, one of the constraints $y : E$ or $y : {\sim}E$ to $\mathcal{A}$. Since application of the
$\rightarrow_\geq$-rule is suspended until no other rule applies to x, by this time $\mathcal{A}$ contains all
constraints of the form $x : (\bowtie m\ R\ E)$ it will ever contain. This combines the effects of both
the choose- and the $\rightarrow_\geq$-rule of the standard algorithm.
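The behaviour of the rule can be illustrated by a small sketch. It is not taken from the
thesis: the tuple encoding of concepts, the representation of an ABox as two sets, and the
names nnf_neg, count_successors and geq_rule_choices are assumptions made for illustration
only. The sketch also assumes $n \geq 1$ in every $(\geq n\ R\ D)$ and leaves condition 3 of the
rule (no propositional rule applies to x) to the caller. It enumerates every ABox the rule
may produce, one per choice of $D_1, \dots, D_k$.

```python
from itertools import product

# Concepts (assumed encoding): ("atom", A), ("not", A), ("and", C1, C2),
# ("or", C1, C2), ("geq", n, R, D), ("leq", n, R, D).

def nnf_neg(c):
    """The ~ operator: negation normal form of the negation of c."""
    kind = c[0]
    if kind == "atom":
        return ("not", c[1])
    if kind == "not":
        return ("atom", c[1])
    if kind == "and":
        return ("or", nnf_neg(c[1]), nnf_neg(c[2]))
    if kind == "or":
        return ("and", nnf_neg(c[1]), nnf_neg(c[2]))
    if kind == "geq":  # ~(>= n R D) = (<= n-1 R D); assumes n >= 1
        if c[1] < 1:
            raise ValueError("sketch assumes n >= 1 in (>= n R D)")
        return ("leq", c[1] - 1, c[2], c[3])
    if kind == "leq":  # ~(<= n R D) = (>= n+1 R D)
        return ("geq", c[1] + 1, c[2], c[3])
    raise ValueError(f"unknown concept: {c!r}")

def count_successors(concepts, roles, x, r, d):
    """Number of r-successors y of x with y : d asserted."""
    return sum(1 for (a, role, y) in roles
               if a == x and role == r and (y, d) in concepts)

def geq_rule_choices(concepts, roles, x, restriction, fresh):
    """Yield every (concepts, roles) pair the >=-rule may produce for x."""
    kind, n, r, d = restriction
    if kind != "geq" or (x, restriction) not in concepts:
        raise ValueError("restriction must be a >=-constraint asserted for x")
    if count_successors(concepts, roles, x, r, d) >= n:
        return  # rule is not applicable
    # E_1..E_k: qualifying concepts of all number restrictions on r asserted for x
    es = sorted({c[3] for (a, c) in concepts
                 if a == x and c[0] in ("geq", "leq") and c[2] == r}, key=repr)
    for picks in product(*[(e, nnf_neg(e)) for e in es]):
        new_concepts = set(concepts) | {(fresh, d)} | {(fresh, p) for p in picks}
        new_roles = set(roles) | {(x, r, fresh)}
        yield new_concepts, new_roles
```

With k relevant concepts the generator produces $2^k$ candidate ABoxes; a non-deterministic
algorithm guesses one of them, while a deterministic implementation would have to search
among them.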

4.3.1 Correctness of the Optimized Algorithm 

To establish the correctness of the optimal algorithm, we will show its termination,
soundness, and completeness. Again, it is convenient to view $\mathcal{A}$ as the graph
$G_{\mathcal{A}} = (V, E, L)$ as defined in Section 2.1. Since the $\rightarrow_\geq$-rule not only adds
sub-concepts of C but in some cases also the NNF of negated sub-concepts, the label L(x) of a
node x is no longer a subset of sub(C) but rather of the larger set clos(C) defined below.

Definition 4.8

For an ALCQ-concept C we define the closure clos(C) as the smallest set of ALCQ-concepts
that

• contains C,

• is closed under sub-concepts, and

• is closed under the application of ${\sim}$.

◇

It is easy to see that the size of clos(C) is linearly bounded in |C|:

Lemma 4.9

For an ALCQ-concept C in NNF,

$$\sharp\,\mathit{clos}(C) \leq 2 \times |C|.$$

Proof. This is an immediate consequence of the fact that

$$\mathit{clos}(C) \subseteq \mathit{sub}(C) \cup \{{\sim}D \mid D \in \mathit{sub}(C)\},$$

which can be shown as follows. Obviously, the set $\mathit{sub}(C) \cup \{{\sim}D \mid D \in \mathit{sub}(C)\}$ contains
C and is closed under the application of ${\sim}$ (note that, for a sub-concept D of a concept
in NNF, ${\sim}{\sim}D = D$). Closure under sub-concepts for the concepts in sub(C) is also
immediate, and can be established for $\{{\sim}D \mid D \in \mathit{sub}(C)\}$ by considering the various
possibilities for ALCQ-concepts. ■
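The bound can be checked on small examples with the following sketch. The tuple encoding
of concepts, the function names, and in particular the size measure (one unit per symbol,
with each number counted as a single symbol) are assumptions of this illustration and not
definitions from the thesis; the sketch also assumes $n \geq 1$ in every $(\geq n\ R\ D)$.

```python
# Concepts (assumed encoding): ("atom", A), ("not", A), ("and", C1, C2),
# ("or", C1, C2), ("geq", n, R, D), ("leq", n, R, D).

def nnf_neg(c):
    """The ~ operator: negation normal form of the negation of c."""
    kind = c[0]
    if kind in ("atom", "not"):
        return ("not" if kind == "atom" else "atom", c[1])
    if kind in ("and", "or"):
        return ("or" if kind == "and" else "and", nnf_neg(c[1]), nnf_neg(c[2]))
    if kind == "geq":  # ~(>= n R D) = (<= n-1 R D); assumes n >= 1
        if c[1] < 1:
            raise ValueError("sketch assumes n >= 1 in (>= n R D)")
        return ("leq", c[1] - 1, c[2], c[3])
    return ("geq", c[1] + 1, c[2], c[3])  # kind == "leq"

def sub(c):
    """All sub-concepts of c, including c itself."""
    result = {c}
    if c[0] in ("and", "or"):
        result |= sub(c[1]) | sub(c[2])
    elif c[0] in ("geq", "leq"):
        result |= sub(c[3])
    return result

def clos(c):
    """Smallest set containing c that is closed under sub-concepts and ~."""
    closed, todo = set(), [c]
    while todo:
        d = todo.pop()
        if d not in closed:
            closed.add(d)
            todo.extend(sub(d))
            todo.append(nnf_neg(d))
    return closed

def size(c):
    """Hypothetical size measure: one unit per symbol of the concept."""
    if c[0] == "atom":
        return 1
    if c[0] == "not":
        return 2
    if c[0] in ("and", "or"):
        return 1 + size(c[1]) + size(c[2])
    return 3 + size(c[3])  # comparison symbol, number, role name

A, NOT_B = ("atom", "A"), ("not", "B")
C = ("and", ("geq", 2, "R", A), ("leq", 1, "R", ("or", A, NOT_B)))
assert clos(C) <= sub(C) | {nnf_neg(d) for d in sub(C)}
assert len(clos(C)) <= 2 * size(C)
```

For this concept sub(C) has six elements and their six NNF-negations are all new, so
clos(C) has twelve elements, well within the bound.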

Similar to ALC, it is easy to show that the graph $G_{\mathcal{A}}$ for an ABox $\mathcal{A}$ generated by
the optimal algorithm from an initial ABox $\{x_0 : C\}$ is a tree with root $x_0$, and for each
edge $(x, y) \in E$, the label L(x, y) is a singleton. Moreover, for each node x it holds that
$L(x) \subseteq \mathit{clos}(C)$.

Termination 

First, we will show that the optimal algorithm always terminates, i.e., each sequence of 
rule applications starting from the ABox {x : C} is finite. The next lemma will also be 
helpful when we consider the complexity of the algorithm. 

Lemma 4.10 

Let C be a concept in NNF and $\mathcal{A}$ an ABox that is generated by the optimal algorithm
starting from {x : C}.

• The length of a path in $G_{\mathcal{A}}$ is limited by |C|.

• The out-degree of $G_{\mathcal{A}}$ is bounded by $|C| \times 2^{|C|}$.

Proof. The linear bound on the length of a path in $G_{\mathcal{A}}$ is established as for the ALC-
algorithm using the fact that the nesting depth of qualifying number restrictions strictly
decreases along a path in $G_{\mathcal{A}}$.

Successors in $G_{\mathcal{A}}$ are only generated by the $\rightarrow_\geq$-rule. For an individual x this rule
will generate at most n successors for each $(\geq n\ R\ D) \in L(x)$. There are at most |C| such
concepts in L(x). Hence the out-degree of x is bounded by $|C| \times 2^{|C|}$, where $2^{|C|}$ is a limit
for the biggest number that may appear in C if binary coding is used. ■

Corollary 4.11 (Termination) 

Any sequence of rule applications starting from an ABox A = {x : C} of the optimal 
algorithm is finite. 

Proof. The sequence of rules induces a sequence of trees. The depth and the out-degree 
of these trees is bounded by some function in |C| by Lemma 4.10. For each individual x 
the label L(x) is a subset of the finite set clos(C). Each application of a rule either 

• adds a new constraint of the form x : D and hence adds an element to L(x), or

• adds fresh individuals to $\mathcal{A}$ and hence adds additional nodes to the tree $G_{\mathcal{A}}$.

Since constraints are never deleted and individuals are never deleted or identified, an infinite
sequence of rule applications must either lead to an infinite number of nodes in the trees,
which contradicts their boundedness, or it leads to an infinite label of one of the nodes x,
which contradicts $L(x) \subseteq \mathit{clos}(C)$. ■



Soundness and Completeness 

We establish soundness and completeness of the optimal algorithm along the lines of
Theorem 3.6. We use a slightly modified notion of ABox satisfiability, which is already implicitly
present in the definition of clash. If we want to apply Theorem 3.6 to prove the correctness
of the algorithm, then we need a clash in an ABox to cause unsatisfiability of that
ABox. For an arbitrary ABox and the definition of clash used by the optimal algorithm,
this is not the case. For example, the ABox

$$\mathcal{A} = \{x : (\leq 1\ R\ A),\ (x, y) : R,\ (x, z) : R,\ y : A,\ z : A\}$$

contains a clash but is satisfiable, because y and z may be interpreted by the same element.
Yet, if we require that, for all individuals x, y, z with $(x, y) : R, (x, z) : R \in \mathcal{A}$ and $y \neq z$,
the individuals y and z must be interpreted by different elements of the domain, then a clash
obviously implies unsatisfiability. This is captured by the definition of the function
$\widehat{\cdot}$ that maps an ALCQ-ABox $\mathcal{A}$ to its differentiation $\widehat{\mathcal{A}}$ defined
by

$$\widehat{\mathcal{A}} = \mathcal{A} \cup \{y \not\doteq z \mid \{(x, y) : R,\ (x, z) : R\} \subseteq \mathcal{A},\ y \neq z\}.$$

For the proof of soundness and completeness of Algorithm 4.7, an ABox $\mathcal{A}$ is called
satisfiable iff $\widehat{\mathcal{A}}$ is satisfiable in the (standard) sense of Definition 2.5. Since
$\widehat{\cdot}$ is idempotent, the standard and the modified notion of satisfiability coincide for a
differentiated ABox $\widehat{\mathcal{A}}$ and we can unambiguously speak of the satisfiability of
$\widehat{\mathcal{A}}$ without specifying if we refer to the modified or the standard notion.
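The example ABox can be replayed in a few lines. The encoding below (atomic concepts as
strings, a number restriction as a tuple, role assertions as triples) and the function
names are assumptions of this sketch, not notation from the thesis. It computes the
inequality assertions that the differentiation adds and checks the number-restriction
part of the clash condition.

```python
def differentiate(roles):
    """Inequalities added by the differentiation: one unordered pair {y, z} for
    every two distinct individuals that are successors of the same x via the
    same role."""
    return {frozenset((y, z))
            for (x1, r1, y) in roles
            for (x2, r2, z) in roles
            if x1 == x2 and r1 == r2 and y != z}

def number_clash(concepts, roles):
    """True iff some x : (<= n R D) has more than n R-successors carrying D."""
    for (x, c) in concepts:
        if isinstance(c, tuple) and c[0] == "leq":
            _, n, r, d = c
            carrying = {y for (a, role, y) in roles
                        if a == x and role == r and (y, d) in concepts}
            if len(carrying) > n:
                return True
    return False

# The ABox from the text: x : (<= 1 R A), (x,y) : R, (x,z) : R, y : A, z : A
concepts = {("x", ("leq", 1, "R", "A")), ("y", "A"), ("z", "A")}
roles = {("x", "R", "y"), ("x", "R", "z")}

assert number_clash(concepts, roles)                  # the ABox contains a clash
assert differentiate(roles) == {frozenset({"y", "z"})}
```

Without the inequality between y and z, an interpretation may map both names to one
element and thereby satisfy the ABox despite the clash; the differentiation rules this out.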

Consider the properties required by Theorem 3.6. As discussed before, with this definition
of satisfiability of ABoxes, it is obvious that, for an ABox $\mathcal{A}$ generated by the optimal
algorithm that contains a clash, $\widehat{\mathcal{A}}$ (and hence, by our definition, $\mathcal{A}$) must be unsatisfiable
(Property 1) and that $\{x_0 : C\}$ is satisfiable iff C is satisfiable (Property 2). It remains to
establish Property 3 (a clash-free and complete ABox is satisfiable, Lemma 4.13) and the
local correctness (Properties 4, 5) of the rules (Lemma 4.14).

For ALC, to prove satisfiability of a complete and clash-free ABox $\mathcal{A}$, we used induction
over the structure of concepts appearing in constraints in $\mathcal{A}$. This was possible because
the ALC-rules, when triggered by an assertion x : D, only add constraints to $\mathcal{A}$ that involve
sub-concepts of D. For ALCQ, and specifically for the $\rightarrow_\geq$-rule, this is no longer true and
hence a proof by induction on the structure of concepts is not feasible. Instead, we will
use induction on the following norm of concepts.

Definition 4.12

For an ALCQ-concept D in NNF, the norm $\|D\|$ is inductively defined by:

$$\|A\| := \|\neg A\| := 0 \quad \text{for } A \in \mathsf{NC}$$
$$\|C_1 \sqcap C_2\| := \|C_1 \sqcup C_2\| := 1 + \|C_1\| + \|C_2\|$$
$$\|(\bowtie n\ R\ D)\| := 1 + \|D\|$$

◇

The reader may verify that this norm satisfies $\|D\| = \|{\sim}D\|$ for every concept D.
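As a small worked example of this property, consider a concept chosen here purely for
illustration. The computation assumes that atomic and negated atomic concepts have norm 0
and that ${\sim}$ maps $(\geq n\ R\ D)$ to $(\leq n-1\ R\ D)$ without changing the qualifying
concept D.

```latex
\begin{align*}
D &= (\geq 2\ R\ (A \sqcap \neg B)), &
\|D\| &= 1 + \|A \sqcap \neg B\| = 1 + (1 + \|A\| + \|\neg B\|) = 1 + (1 + 0 + 0) = 2,\\
{\sim}D &= (\leq 1\ R\ (A \sqcap \neg B)), &
\|{\sim}D\| &= 1 + \|A \sqcap \neg B\| = 2,\\
{\sim}(A \sqcap \neg B) &= \neg A \sqcup B, &
\|\neg A \sqcup B\| &= 1 + \|\neg A\| + \|B\| = 1 = \|A \sqcap \neg B\|.
\end{align*}
```

In particular, when the $\rightarrow_\geq$-rule adds ${\sim}E$ for a qualifying concept E of a number
restriction, the added concept has strictly smaller norm than that restriction, which is
what the induction below relies on.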
Lemma 4.13 

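Let $\mathcal{A}$ be a complete and clash-free ABox generated by the optimal algorithm. Then
$\mathcal{A}$ is satisfiable.

Proof. Let $\mathcal{A}$ be a complete and clash-free ABox generated by applications of the optimal
rules and $\widehat{\mathcal{A}}$ its differentiation. We show that the canonical interpretation
$\mathcal{I}_{\mathcal{A}}$, as defined in Definition 3.7, is a model of $\widehat{\mathcal{A}}$.

By definition of $\mathcal{I}_{\mathcal{A}}$, all constraints of the form (x, y) : R are trivially satisfied. Also,
$y \neq z$ implies $y^{\mathcal{I}_{\mathcal{A}}} \neq z^{\mathcal{I}_{\mathcal{A}}}$ by construction of $\mathcal{I}_{\mathcal{A}}$. Thus, all remaining
assertions in $\widehat{\mathcal{A}}$ are of the form x : D and are also present in $\mathcal{A}$. Hence, it is sufficient
to show that $x : D \in \mathcal{A}$ implies $x^{\mathcal{I}_{\mathcal{A}}} \in D^{\mathcal{I}_{\mathcal{A}}}$, which we will do by induction on the
norm $\|\cdot\|$ of a concept D. Note that, by the definition of $\mathcal{I}_{\mathcal{A}}$, $x^{\mathcal{I}_{\mathcal{A}}} = x$ for every
individual x that occurs in $\mathcal{A}$.

• The first base case is D = A for $A \in \mathsf{NC}$. $x : A \in \mathcal{A}$ immediately implies
$x \in A^{\mathcal{I}_{\mathcal{A}}}$ by the definition of $\mathcal{I}_{\mathcal{A}}$. The second base case is $x : \neg A \in \mathcal{A}$. Since
$\mathcal{A}$ is clash-free, this implies $x : A \notin \mathcal{A}$ and hence $x \notin A^{\mathcal{I}_{\mathcal{A}}}$. This implies
$x \in (\neg A)^{\mathcal{I}_{\mathcal{A}}}$.

• For the conjunction and disjunction of concepts this follows exactly as in the proof
of Lemma 3.8.

• $x : (\geq n\ R\ D) \in \mathcal{A}$ implies $\sharp R^{\mathcal{A}}(x, D) \geq n$ because otherwise the
$\rightarrow_\geq$-rule would be applicable and $\mathcal{A}$ would not be complete. By induction, we have
$y \in D^{\mathcal{I}_{\mathcal{A}}}$ for each y with $y : D \in \mathcal{A}$. Hence $\sharp R^{\mathcal{I}_{\mathcal{A}}}(x, D) \geq n$ and thus
$x \in (\geq n\ R\ D)^{\mathcal{I}_{\mathcal{A}}}$.

• $x : (\leq n\ R\ D) \in \mathcal{A}$ implies $\sharp R^{\mathcal{A}}(x, D) \leq n$ because $\mathcal{A}$ is clash-free. Hence it is
sufficient to show that $\sharp R^{\mathcal{I}_{\mathcal{A}}}(x, D) \leq \sharp R^{\mathcal{A}}(x, D)$ holds. On the contrary, assume
$\sharp R^{\mathcal{I}_{\mathcal{A}}}(x, D) > \sharp R^{\mathcal{A}}(x, D)$ holds. Then there is an individual y such that
$(x, y) : R \in \mathcal{A}$ and $y \in D^{\mathcal{I}_{\mathcal{A}}}$ but $y : D \notin \mathcal{A}$. The application of the $\rightarrow_\geq$-rule is
suspended until the propositional rules are no longer applicable to x and hence, by the time y
is generated by an application of the $\rightarrow_\geq$-rule, $\mathcal{A}$ contains the assertion
$x : (\leq n\ R\ D)$. Hence, the $\rightarrow_\geq$-rule ensures $y : D \in \mathcal{A}$ or $y : {\sim}D \in \mathcal{A}$. Since we have
assumed that $y : D \notin \mathcal{A}$, this implies $y : {\sim}D \in \mathcal{A}$ and, by the induction hypothesis,
$y \in ({\sim}D)^{\mathcal{I}_{\mathcal{A}}}$ holds, which is a contradiction. ■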

Lemma 4.14 (Local Correctness) 

Let $\mathcal{A}$, $\mathcal{A}'$ be ABoxes generated by the optimal algorithm from an ABox of the form
$\{x_0 : C\}$.

1. If $\mathcal{A}'$ is obtained from $\mathcal{A}$ by application of the (deterministic) $\rightarrow_\sqcap$-rule, then
$\mathcal{A}$ is satisfiable iff $\mathcal{A}'$ is satisfiable.

2. If $\mathcal{A}'$ is obtained from $\mathcal{A}$ by application of the (non-deterministic) $\rightarrow_\sqcup$- or
$\rightarrow_\geq$-rule, then $\mathcal{A}$ is satisfiable if $\mathcal{A}'$ is satisfiable. Moreover, if $\mathcal{A}$ is satisfiable,
then the rule can always be applied in such a way that it yields a satisfiable $\mathcal{A}''$.

Proof. $\mathcal{A} \rightarrow \mathcal{A}'$ for any rule $\rightarrow$ implies $\mathcal{A} \subseteq \mathcal{A}'$ and, by the definition of
$\widehat{\cdot}$, $\widehat{\mathcal{A}} \subseteq \widehat{\mathcal{A}'}$. Hence, if $\mathcal{A}'$ is satisfiable then so is $\mathcal{A}$. For the other
direction, the $\rightarrow_\sqcap$- and $\rightarrow_\sqcup$-rule can be handled as in the proof for ALC in Lemma 3.9.

It remains to consider the $\rightarrow_\geq$-rule. Let $\mathcal{I}$ be a model of $\widehat{\mathcal{A}}$ and let
$x : (\geq n\ R\ D)$ be the constraint that triggers the application of the $\rightarrow_\geq$-rule. Since the
$\rightarrow_\geq$-rule is applicable, we have $\sharp R^{\mathcal{A}}(x, D) < n$. We claim that there is an
$a \in \Delta^{\mathcal{I}}$ with

$$(x^{\mathcal{I}}, a) \in R^{\mathcal{I}},\quad a \in D^{\mathcal{I}} \quad\text{and}\quad a \notin \{z^{\mathcal{I}} \mid (x, z) : R \in \mathcal{A}\}. \qquad (*)$$

Before we prove this claim, we show how it can be used to finish the proof. The element
a is used to "select" a choice of the $\rightarrow_\geq$-rule that preserves satisfiability: let
$\{E_1, \dots, E_k\}$ be an enumeration of the set $\{E \mid x : (\bowtie m\ R\ E) \in \mathcal{A}\}$. We set

$$\mathcal{A}'' = \mathcal{A} \cup \{(x, y) : R,\ y : D\} \cup \{y : E_i \mid a \in E_i^{\mathcal{I}}\} \cup \{y : {\sim}E_i \mid a \notin E_i^{\mathcal{I}}\}.$$

Obviously, $\mathcal{I}[y \mapsto a]$, the interpretation that maps y to a and agrees with $\mathcal{I}$ on all other
names, is a model for $\widehat{\mathcal{A}''}$, since y is a fresh individual and a satisfies (*). The ABox $\mathcal{A}''$
is a possible result of the application of the $\rightarrow_\geq$-rule to $\mathcal{A}$, which proves that the
$\rightarrow_\geq$-rule can indeed be applied in a way that maintains satisfiability of the ABox.

We will now come back to the claim. It is obvious that there is an a with $(x^{\mathcal{I}}, a) \in R^{\mathcal{I}}$
and $a \in D^{\mathcal{I}}$ that is not contained in $\{z^{\mathcal{I}} \mid (x, z) : R,\ z : D \in \mathcal{A}\}$, because
$\sharp R^{\mathcal{I}}(x, D) \geq n > \sharp R^{\mathcal{A}}(x, D)$. Yet a might appear as the image of an individual z such that
$(x, z) : R \in \mathcal{A}$ but $z : D \notin \mathcal{A}$.

Now, $(x, z) : R \in \mathcal{A}$ and $z : D \notin \mathcal{A}$ implies $z : {\sim}D \in \mathcal{A}$. This is due to the fact that the
constraint (x, z) : R must have been generated by an application of the $\rightarrow_\geq$-rule because
it has not been an element of the initial ABox. The application of this rule was suspended
until neither the $\rightarrow_\sqcap$- nor the $\rightarrow_\sqcup$-rule were applicable to x. Hence, if
$x : (\geq n\ R\ D)$ is an element of $\mathcal{A}$ now, then it has already been in $\mathcal{A}$ when the
$\rightarrow_\geq$-rule that generated z was applied. The $\rightarrow_\geq$-rule guarantees that either z : D or
$z : {\sim}D$ is added to $\mathcal{A}$, hence $z : {\sim}D \in \mathcal{A}$. This is a contradiction to $z^{\mathcal{I}} = a$ because
under the assumption that $\mathcal{I}$ is a model of $\mathcal{A}$ this would imply $a \in ({\sim}D)^{\mathcal{I}}$ while we
initially assumed $a \in D^{\mathcal{I}}$. ■

As an immediate consequence of Corollary 4.11 and Lemmas 4.13 and 4.14 together with
Theorem 3.6 we get:

Corollary 4.15

The optimal algorithm is a non-deterministic decision procedure for satisfiability of
ALCQ-concepts.

4.3.2 Complexity of the Optimal Algorithm 

The optimal algorithm will enable us to prove Theorem 4.6. We will give a proof by 
sketching an implementation of this algorithm that runs in polynomial space. 

Lemma 4.16 

The optimal algorithm can be implemented in PSpace.

Proof. Let C be an ALCQ-concept to be tested for satisfiability. We can assume C to be
in NNF because the transformation of a concept to NNF can be performed in linear time.

The key idea for the PSpace implementation is the trace technique (Schmidt-Schauß
& Smolka, 1991) we have already used for the ALC-algorithm in Section 3.2.2, and which is
based on the fact that it is sufficient to keep only a single path (a trace) of $G_{\mathcal{A}}$ in memory
at a given stage if $\mathcal{A}$ is generated in a depth-first manner. This idea has been the key
to a PSpace upper bound for $\mathsf{K}_m$ and ALC in (Ladner, 1977; Schmidt-Schauß & Smolka,
1991; Halpern & Moses, 1992). To do this we need to store the values for $\sharp R^{\mathcal{A}}(x, D)$ for
each individual x in the path, each R that appears in clos(C), and each $D \in \mathit{clos}(C)$. By
storing these values in binary form, we are able to keep information about exponentially
many successors in memory while storing only a single path at a given stage.

Consider the algorithm in Fig. 4.5, where $\mathsf{NR}_C$ denotes the set of role names that
appear in clos(C). It re-uses the space needed to check the satisfiability of a successor
y of x once the existence of a complete and clash-free "subtree" for the constraints on y
has been established. This is admissible since, as was the case for ALC, the optimal rules
will never modify this subtree once it is completed. Constraints in this subtree also have
no influence on the completeness or the existence of a clash in the rest of the tree, with
the exception that constraints of the form y : D for R-successors y of x contribute to the
value of $\sharp R^{\mathcal{A}}(x, D)$. These numbers play a role both in the definition of a clash and for the
applicability of the $\rightarrow_\geq$-rule. Hence, in order to re-use the space occupied by the subtree
for y, it is necessary and sufficient to store these numbers.



Figure 4.5 A non-deterministic PSpace decision procedure for ALCQ.

ALCQ-SAT(C) := sat($x_0$, $\{x_0 : C\}$)

sat(x, $\mathcal{A}$):
    allocate counters $\sharp R^{\mathcal{A}}(x, D) := 0$ for all $R \in \mathsf{NR}_C$ and $D \in \mathit{clos}(C)$.
    while (the $\rightarrow_\sqcap$- or the $\rightarrow_\sqcup$-rule can be applied) and ($\mathcal{A}$ is clash-free) do
        apply the $\rightarrow_\sqcap$- or the $\rightarrow_\sqcup$-rule to $\mathcal{A}$.
    od
    if $\mathcal{A}$ contains a clash then return "not satisfiable".
    while (the $\rightarrow_\geq$-rule applies to a constraint $x : (\geq n\ R\ D) \in \mathcal{A}$) do
        $\mathcal{A}_{new} := \{(x, y) : R,\ y : D,\ y : D_1, \dots, y : D_k\}$
        where
            y is a fresh individual,
            $\{E_1, \dots, E_k\} = \{E \mid x : (\bowtie m\ R\ E) \in \mathcal{A}\}$, and
            $D_i$ is chosen non-deterministically from $\{E_i, {\sim}E_i\}$
        for each $y : E \in \mathcal{A}_{new}$ do increment $\sharp R^{\mathcal{A}}(x, E)$
        if $x : (\leq m\ R\ E) \in \mathcal{A}$ and $\sharp R^{\mathcal{A}}(x, E) > m$ then return "not satisfiable".
        if sat(y, $\mathcal{A}_{new}$) = "not satisfiable" then return "not satisfiable"
        discard $\mathcal{A}_{new}$ from memory
    od
    discard the counters for x from memory,
    return "satisfiable"
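The control flow of this procedure can be made concrete with the following sketch. It is
an illustration under several assumptions and not the implementation discussed in the
thesis: concepts are encoded as tuples, every $(\geq n\ R\ D)$ is assumed to have $n \geq 1$, and
all names (saturations, successors_ok, alcq_sat, ...) are hypothetical. Most importantly,
the non-deterministic choices are replaced by backtracking, so the sketch does not achieve
the polynomial space bound; it only mirrors the counter-based, depth-first structure, in
which a successor's label is handed to a recursive call and forgotten afterwards.

```python
from itertools import product

# Concepts (assumed encoding): ("atom", A), ("not", A), ("and", C1, C2),
# ("or", C1, C2), ("geq", n, R, D), ("leq", n, R, D).

def nnf_neg(c):
    """The ~ operator: negation normal form of the negation of c."""
    kind = c[0]
    if kind in ("atom", "not"):
        return ("not" if kind == "atom" else "atom", c[1])
    if kind in ("and", "or"):
        return ("or" if kind == "and" else "and", nnf_neg(c[1]), nnf_neg(c[2]))
    if kind == "geq":  # assumes n >= 1
        if c[1] < 1:
            raise ValueError("sketch assumes n >= 1 in (>= n R D)")
        return ("leq", c[1] - 1, c[2], c[3])
    return ("geq", c[1] + 1, c[2], c[3])  # kind == "leq"

def saturations(label):
    """Yield every label obtainable by exhausting the and-/or-rules."""
    for c in label:
        if c[0] == "and" and not {c[1], c[2]} <= label:
            yield from saturations(label | {c[1], c[2]})
            return
        if c[0] == "or" and c[1] not in label and c[2] not in label:
            yield from saturations(label | {c[1]})
            yield from saturations(label | {c[2]})
            return
    yield label

def atomic_clash(label):
    return any(c[0] == "atom" and ("not", c[1]) in label for c in label)

def sat(label):
    """True iff a node with this initial label has a complete, clash-free subtree."""
    for full in saturations(frozenset(label)):
        if atomic_clash(full):
            continue
        restrictions = [c for c in full if c[0] in ("geq", "leq")]
        if successors_ok(restrictions, {}):
            return True
    return False

def successors_ok(restrictions, counters):
    """Generate successors one at a time; counters[(R, E)] plays the role of the
    counter for the number of R-successors carrying E."""
    for kind, n, r, d in restrictions:
        if kind != "geq" or counters.get((r, d), 0) >= n:
            continue  # not a >=-restriction, or already satisfied
        es = sorted({c[3] for c in restrictions if c[2] == r}, key=repr)
        for picks in product(*[(e, nnf_neg(e)) for e in es]):
            new_label = frozenset(picks) | {d}
            new_counters = dict(counters)
            for e in new_label:
                new_counters[(r, e)] = new_counters.get((r, e), 0) + 1
            if any(k == "leq" and new_counters.get((rr, e), 0) > m
                   for k, m, rr, e in restrictions):
                continue  # this guess violates a <=-restriction
            if sat(new_label) and successors_ok(restrictions, new_counters):
                return True
        return False  # no guess works for this restriction
    return True  # every >=-restriction is satisfied

def alcq_sat(concept):
    return sat({concept})
```

For instance, under this encoding alcq_sat rejects $(\geq 2\ R\ A) \sqcap (\leq 1\ R\ A)$ and accepts
$(\geq 2\ R\ A) \sqcap (\leq 1\ R\ B)$, where the second successor must be guessed to carry $\neg B$.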



Let us examine the space usage of this algorithm. Let n = |C|. The algorithm is
designed to keep only a single path of $G_{\mathcal{A}}$ in memory at a given stage. For each individual
x on a path, constraints of the form x : D have to be stored for concepts $D \in \mathit{clos}(C)$.
The size of clos(C) is bounded by 2n and hence the constraints for a single individual can
be stored in $O(n)$ bits. For each individual, there are at most $|\mathsf{NR}_C| \times |\mathit{clos}(C)| = O(n^2)$
counters to be stored. The numbers to be stored in these counters do not exceed the
out-degree of x, which, by Lemma 4.10, is bounded by $|\mathit{clos}(C)| \times 2^{|C|}$. Hence each counter
can be stored using $O(n^2)$ bits when binary coding is used to represent the counters, and
all counters for a single individual require $O(n^4)$ bits. Due to Lemma 4.10, the length of a
path is limited by n, which yields an overall memory consumption of $O(n^5 + n^2) = O(n^5)$.

Theorem 4.6 now is a simple corollary of the PSpace-hardness of ALC, Lemma 4.16,
and Savitch's Theorem (Savitch, 1970).

4.4 Extensions of ALCQ

It is possible to augment the DL ALCQ without losing the PSpace property of the
concept satisfiability problem. In this section we extend the techniques to obtain a PSpace
algorithm for the logic ALCQIb, which extends ALCQ with inverse roles and safe Boolean
combinations of roles. This extends the results from (Tobies, 2001) for the modal logic
$\mathbf{Gr}(\mathbf{K}_{\mathcal{R}^{-1}_\cap})$, which corresponds to ALCQ extended with inverse roles and intersection of roles.

Definition 4.17 (Syntax of ALCQIb)

Let NC be a set of atomic concept names and NR be a set of atomic role names. With
$\overline{\mathsf{NR}} := \mathsf{NR} \cup \{R^{-1} \mid R \in \mathsf{NR}\}$ we denote the set of ALCQIb-roles.

A role S of the form $S = R^{-1}$ with $R \in \mathsf{NR}$ is called inverse role.

An ALCQIb-role expression $\omega$ is built from ALCQIb-roles using the operators $\sqcap$ (role
intersection), $\sqcup$ (role union), and $\neg$ (role complement), with the restriction that, when
transformed into disjunctive normal form, every disjunct contains at least one non-negated
conjunct. A role expression that satisfies this constraint is called safe.

The set of ALCQIb-concepts is built inductively from these using the following grammar,
where $A \in \mathsf{NC}$, $\omega$ is an ALCQIb-role expression, and $n \in \mathbb{N}$:

$$C ::= A \mid \neg C \mid C_1 \sqcap C_2 \mid C_1 \sqcup C_2 \mid (\leq n\ \omega\ C) \mid (\geq n\ \omega\ C).$$

ALCQI is the fragment of ALCQIb where every role expression in a number restriction
consists of a single (possibly inverse) ALCQIb-role. ◇

The role expression $\neg(\neg R_1 \sqcup (R_2^{-1} \sqcap \neg R_3)) \sqcup (\neg R_2 \sqcap R_2^{-1})$ is safe (its DNF is
$(R_1 \sqcap \neg R_2^{-1}) \sqcup (R_1 \sqcap R_3) \sqcup (\neg R_2 \sqcap R_2^{-1})$) while $R \sqcup \neg R$ is not an ALCQIb role
expression since it is already in DNF and $\neg R$ occurs as single element in one of the disjuncts.
The latter example also shows that some kind of restriction on role expressions is indeed
necessary if we want to obtain a PSpace algorithm: the concept $(\leq 0\ (R \sqcup \neg R)\ \neg C)$ is
satisfiable iff C is globally satisfiable, which is an ExpTime-complete problem (see the proof
of Theorem 3.18). Indeed, for unrestricted role expressions, the problem in the presence of
qualifying number restrictions is of even higher complexity: it is NExpTime-complete (see
Corollary 5.34).

The syntactic restriction we have chosen enforces that, for a pair (x, y) to appear in
the extension of a role expression $\omega$, it must occur in the extension of at least one of the
roles that occur in $\omega$. Hence, if no role relation holds between x and y, concepts asserted
for x do not impose any restrictions on y.

A similar restriction can be found in the database world in conjunction with the notion
of safe-range queries (Abiteboul, Hull, & Vianu, 1995, Chapter 5). To decide whether a
role expression $\omega$ is safe, it is not necessary to calculate its DNF (which might require
exponential time). One can rather use the following algorithm: first, compute the NNF $\omega'$
of $\omega$ by pushing negation inwards using de Morgan's laws. Second, test whether $\mathit{safe}(\omega')$
holds, where the function safe is defined inductively on the structure of role expressions as
follows (compare Abiteboul et al., 1995, Algorithm 5.4.3):

$$\mathit{safe}(R) = \mathit{true} \quad \text{for } R \in \overline{\mathsf{NR}}$$
$$\mathit{safe}(\neg R) = \mathit{false} \quad \text{for } R \in \overline{\mathsf{NR}}$$
$$\mathit{safe}(\omega_1 \sqcap \omega_2) = \mathit{safe}(\omega_1) \lor \mathit{safe}(\omega_2)$$
$$\mathit{safe}(\omega_1 \sqcup \omega_2) = \mathit{safe}(\omega_1) \land \mathit{safe}(\omega_2)$$

It is easy to see that a role expression is safe iff this algorithm yields true. Hence, a role
expression can be tested for safety in polynomial time.
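The two steps of this test translate directly into code. In the sketch below the tuple
encoding of role expressions and the names role_nnf, safe_nnf and is_safe are assumptions
made for illustration; inverse roles are simply treated as role names of their own (written
here with a trailing "-").

```python
# Role expressions (assumed encoding): ("role", name), ("not", w),
# ("and", w1, w2), ("or", w1, w2).

def role_nnf(w, negate=False):
    """Push negation inwards with de Morgan's laws."""
    kind = w[0]
    if kind == "role":
        return ("not", w) if negate else w
    if kind == "not":
        return role_nnf(w[1], not negate)
    if kind in ("and", "or"):
        flipped = "or" if kind == "and" else "and"
        op = flipped if negate else kind
        return (op, role_nnf(w[1], negate), role_nnf(w[2], negate))
    raise ValueError(f"unknown role expression: {w!r}")

def safe_nnf(w):
    """The inductive function safe, defined on expressions in NNF."""
    kind = w[0]
    if kind == "role":
        return True
    if kind == "not":  # in NNF, negation only occurs in front of a role
        return False
    if kind == "and":
        return safe_nnf(w[1]) or safe_nnf(w[2])
    return safe_nnf(w[1]) and safe_nnf(w[2])  # kind == "or"

def is_safe(w):
    return safe_nnf(role_nnf(w))

R, R1, R2, R3 = (("role", n) for n in ("R", "R1", "R2", "R3"))
R2_INV = ("role", "R2-")

# not(not R1 or (R2^-1 and not R3)) or (not R2 and R2^-1): safe
example = ("or",
           ("not", ("or", ("not", R1), ("and", R2_INV, ("not", R3)))),
           ("and", ("not", R2), R2_INV))
assert is_safe(example)
assert not is_safe(("or", R, ("not", R)))  # R or not R: unsafe
```

Both functions visit every node of the expression once, which reflects the polynomial
bound stated above.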

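The semantics of ALCQ-concepts can be extended to ALCQIb-concepts by fixing the
interpretations of the role expressions. This is done in the obvious way.

Definition 4.18 (Semantics of ALCQIb)

For an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, the semantics of ALCQIb-concepts is defined
inductively as for ALCQ-concepts with the additional rules:

$$(\leq n\ \omega\ C)^{\mathcal{I}} = \{x \in \Delta^{\mathcal{I}} \mid \sharp\{y \mid (x, y) \in \omega^{\mathcal{I}} \text{ and } y \in C^{\mathcal{I}}\} \leq n\},$$
$$(\geq n\ \omega\ C)^{\mathcal{I}} = \{x \in \Delta^{\mathcal{I}} \mid \sharp\{y \mid (x, y) \in \omega^{\mathcal{I}} \text{ and } y \in C^{\mathcal{I}}\} \geq n\},$$

where the interpretation of a role expression $\omega$ is obtained by extending the valuation $\cdot^{\mathcal{I}}$
inductively to role expressions by setting:

$$(R^{-1})^{\mathcal{I}} = \{(y, x) \mid (x, y) \in R^{\mathcal{I}}\},$$
$$(\neg\omega)^{\mathcal{I}} = (\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}) \setminus \omega^{\mathcal{I}},$$
$$(\omega_1 \sqcap \omega_2)^{\mathcal{I}} = \omega_1^{\mathcal{I}} \cap \omega_2^{\mathcal{I}},$$
$$(\omega_1 \sqcup \omega_2)^{\mathcal{I}} = \omega_1^{\mathcal{I}} \cup \omega_2^{\mathcal{I}}.$$

◇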



Obviously every ALCQ concept is also an ALCQIb concept. We will use the letters $\omega$, $\sigma$ to
range over ALCQIb-role-expressions. To avoid dealing with roles of the form $(R^{-1})^{-1}$ we use
the convention that $(R^{-1})^{-1} = R$ for any $R \in \mathsf{NR}$. This is justified by the semantics. The
definition of NNF and clos(·) can be extended from ALCQ to ALCQIb in a straightforward
manner. Moreover, we use the following notation:

Definition 4.19

Let $\mathcal{R}$ be a set of (possibly inverse) roles and $\omega$ a role expression. We view the roles in
$\overline{\mathsf{NR}}$ as propositional variables and $\mathcal{R}$ as the propositional interpretation that maps exactly
the elements of $\mathcal{R}$ to true and all other roles to false. We write $\mathcal{R} \models \omega$ iff $\omega$, viewed as a
propositional formula, evaluates to true under $\mathcal{R}$. ◇

The intended use of this definition is captured by the following simple lemma:

Lemma 4.20

Let $\mathcal{I}$ be an interpretation, $x, y \in \Delta^{\mathcal{I}}$ and $\omega$ a role expression.

$$(x, y) \in \omega^{\mathcal{I}} \quad\text{iff}\quad \{R \in \overline{\mathsf{NR}} \mid (x, y) \in R^{\mathcal{I}}\} \models \omega.$$

For two individuals x, y in an ABox $\mathcal{A}$ and a role expression $\omega$,

$$\{R \mid (x, y) : R \in \mathcal{A}\} \models \omega$$

implies $(x, y) \in \omega^{\mathcal{I}_{\mathcal{A}}}$ for the canonical interpretation $\mathcal{I}_{\mathcal{A}}$.

4.4.1 Reasoning for ALCQIb

We will use similar techniques as in the previous section to obtain a PSpace-algorithm for
ALCQIb. We still use ABoxes to capture the constraints generated by completion rules, with
the only change that we allow inverse roles to appear in role assertions and require that,
for any $R \in \overline{\mathsf{NR}}$, an ABox contains the constraint (x, y) : R iff it contains the constraint
$(y, x) : R^{-1}$. For an ABox $\mathcal{A}$, a role expression $\omega$, and a concept D, let $\sharp\omega^{\mathcal{A}}(x, D)$ be the
number of individuals y such that $\{R \mid (x, y) : R \in \mathcal{A}\} \models \omega$ and $y : D \in \mathcal{A}$. Due to the
syntactic restriction on role expressions, an individual y may only contribute to $\sharp\omega^{\mathcal{A}}(x, D)$
if $(x, y) : R \in \mathcal{A}$ for some (possibly inverse) role R that occurs in $\omega$.
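Both the propositional reading of role expressions and the resulting count are easy to
compute. The following sketch uses an assumed encoding (role expressions as tuples, role
names as strings with a trailing "-" for inverses, role assertions as triples); the names
models and count_omega are hypothetical. The counting function only inspects individuals
connected to x by some role assertion, which is justified for safe role expressions only.

```python
# Role expressions (assumed encoding): ("role", name), ("not", w),
# ("and", w1, w2), ("or", w1, w2).

def models(roles, w):
    """True iff the set `roles`, read as a propositional interpretation,
    satisfies the role expression w."""
    kind = w[0]
    if kind == "role":
        return w[1] in roles
    if kind == "not":
        return not models(roles, w[1])
    if kind == "and":
        return models(roles, w[1]) and models(roles, w[2])
    if kind == "or":
        return models(roles, w[1]) or models(roles, w[2])
    raise ValueError(f"unknown role expression: {w!r}")

def count_omega(concepts, role_assertions, x, w, d):
    """Number of individuals y whose roles from x satisfy w and with y : d asserted.
    Only individuals linked to x are inspected (sound for safe w only)."""
    linked = {y for (a, _, y) in role_assertions if a == x}
    total = 0
    for y in linked:
        between = {r for (a, r, b) in role_assertions if a == x and b == y}
        if models(between, w) and (y, d) in concepts:
            total += 1
    return total

R, S = ("role", "R"), ("role", "S")
omega = ("and", R, ("not", S))  # R and not S: a safe role expression

role_assertions = {
    ("x", "R", "y"), ("y", "R-", "x"),
    ("x", "R", "z"), ("z", "R-", "x"),
    ("x", "S", "z"), ("z", "S-", "x"),
}
concepts = {("y", "A"), ("z", "A")}

assert count_omega(concepts, role_assertions, "x", omega, "A") == 1  # only y counts
assert not models(set(), omega)           # no roles at all never satisfy a safe expression
assert models(set(), ("or", R, ("not", R)))  # the unsafe R or not R is satisfied by no roles
```

The last two assertions show why safety matters here: for an unsafe expression, individuals
without any role assertion to x would have to be counted as well.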

Algorithm 4.21 (The ALCQIb-algorithm)

We modify the definition of clash to deal with safe role expressions as follows. An ABox
$\mathcal{A}$ contains a clash iff

• $\{x : A,\ x : \neg A\} \subseteq \mathcal{A}$ for some individual x and $A \in \mathsf{NC}$, or

• $x : (\leq n\ \omega\ D) \in \mathcal{A}$ and $\sharp\omega^{\mathcal{A}}(x, D) > n$ for some individual x, role expression $\omega$,
concept D, and $n \in \mathbb{N}$.

The set of rules dealing with ALCQIb is shown in Figure 4.6. The algorithm maintains
a binary relation $\prec_{\mathcal{A}}$ between the individuals in an ABox $\mathcal{A}$ with $x \prec_{\mathcal{A}} y$ iff y was inserted
by the $\rightarrow_\geq$-rule to satisfy a constraint for x. When considering the graph $G_{\mathcal{A}}$, the relation
$\prec_{\mathcal{A}}$ corresponds to the successor relation between nodes. Hence, when $x \prec_{\mathcal{A}} y$ holds we
will call y a successor of x and x a predecessor of y. We denote the transitive closure of
$\prec_{\mathcal{A}}$ by $\prec^{+}_{\mathcal{A}}$.

For a set of individuals X and an ABox $\mathcal{A}$, we denote the subset of $\mathcal{A}$ in which no
individual from X occurs in a constraint by $\mathcal{A} - X$. The $\rightarrow_\sqcap$-, $\rightarrow_\sqcup$- and
$\rightarrow_{\mathrm{choose}}$-rule are called non-generating rules while the $\rightarrow_\geq$-rule is called a generating rule.

Let C be an ALCQIb-concept in NNF and $\mathsf{NR}_C$ the set of roles that occur in C together
with their inverses. To test the satisfiability of C, the ALCQIb-algorithm starts with the
initial ABox $\{x_0 : C\}$ and successively applies the rules from Figure 4.6, stopping when a
clash occurs or the $\rightarrow_\geq$-rule fails. The algorithm answers "C is satisfiable" iff the rules can
be applied in a way that yields a complete ABox. ◇

Figure 4.6 The completion rules for ALCQIb.

$\rightarrow_\sqcap$, $\rightarrow_\sqcup$: see Fig. 4.1

$\rightarrow_{\mathrm{choose}}$: if 1. $x : (\bowtie n\ \omega\ D) \in \mathcal{A}$ and
              2. for some R that occurs in $\omega$ there is a y with $(x, y) : R \in \mathcal{A}$, and
                 $\{y : D,\ y : {\sim}D\} \cap \mathcal{A} = \emptyset$
           then $\mathcal{A} \rightarrow_{\mathrm{choose}} \mathcal{A}' \cup \{y : E\}$ where $E \in \{D, {\sim}D\}$
              and $\mathcal{A}' = \mathcal{A} - \{z \mid y \prec^{+}_{\mathcal{A}} z\}$

$\rightarrow_\geq$: if 1. $x : (\geq n\ \omega\ D) \in \mathcal{A}$, and
         2. $\sharp\omega^{\mathcal{A}}(x, D) < n$, and
         3. no non-generating rule can be applied to a constraint for x
      then guess a set $\mathcal{R} = \{R_1, \dots, R_m\} \subseteq \mathsf{NR}_C$
         if $\mathcal{R} \not\models \omega$ then fail
         else $\mathcal{A} \rightarrow_\geq \mathcal{A} \cup \{y : D\} \cup \mathcal{A}' \cup \mathcal{A}''$ and set $x \prec_{\mathcal{A}} y$ where
            $\mathcal{A}' = \{y : D_1, \dots, y : D_k\}$, $D_i \in \{E_i, {\sim}E_i\}$, and
            $\{E_1, \dots, E_k\} = \{E \mid x : (\bowtie m\ \sigma\ E) \in \mathcal{A}\}$
            $\mathcal{A}'' = \{(x, y) : R_1,\ (y, x) : R_1^{-1},\ \dots,\ (x, y) : R_m,\ (y, x) : R_m^{-1}\}$
            y is a fresh individual



For the different kinds of non-determinism present in this algorithm, compare the
discussion below Algorithm 3.2. Similar to the case for ALCQ, it is shown in the proof of
Lemma 4.25 that the choice of which rule to apply when is don't-care non-deterministic.
This implies that one is free to choose an arbitrary strategy that decides which rule to
apply if more than one is applicable.

The $\rightarrow_\geq$-rule, while looking complicated, is a straightforward extension of the $\rightarrow_\geq$-rule
for ALCQ, which takes into account that we need to guess a set of roles between the old
individual x and the freshly introduced individual y such that these roles satisfy the role
expression $\omega$ currently under consideration. The $\rightarrow_{\mathrm{choose}}$-rule requires more explanation.

For ALCQ, the optimal algorithm generates an ABox $\mathcal{A}$ in a way that, whenever
$x : (\bowtie n\ R\ D) \in \mathcal{A}$, then, for any y with $(x, y) : R \in \mathcal{A}$, either y : D or $y : {\sim}D \in \mathcal{A}$. This
was achieved by suspending the generation of any successors y of x until $\mathcal{A}$ contained all
constraints of the form x : D it would ever contain. In the presence of inverse relations,
this is no longer possible because y might have been generated as a predecessor of x and
hence before it was possible to know which concepts D might be relevant. There are at
least two possible ways to overcome this problem. One is to guess, for every x and every
$D \in \mathit{clos}(C)$, whether x : D or $x : {\sim}D$. In this case, since the termination of the optimal
algorithm as proved in Corollary 4.11 relies on the fact that the nesting of qualifying number
restrictions strictly decreases along a path in the induced graph $G_{\mathcal{A}}$, termination would no
longer be guaranteed. It would have to be enforced by different means.

Here, we use another approach. We can distinguish two different situations where
$\{x : (\bowtie n\ \omega\ D),\ (x, y) : R\} \subseteq \mathcal{A}$ for some R that occurs in $\omega$, and
$\{y : D,\ y : {\sim}D\} \cap \mathcal{A} = \emptyset$: y is a predecessor of x ($y \prec_{\mathcal{A}} x$) or a successor of x
($x \prec_{\mathcal{A}} y$). The second situation will never occur. This is due to the interplay of the
$\rightarrow_\geq$-rule and the $\rightarrow_{\mathrm{choose}}$-rule. The former is suspended until all known relevant information
has been added for x, the latter deletes certain parts of the ABox whenever new constraints
are added for predecessor individuals.

The first situation is resolved by non-deterministically adding either y : D or $y : {\sim}D$
to $\mathcal{A}$. The subsequent deletion of all constraints involving individuals from $\{z \mid y \prec^{+}_{\mathcal{A}} z\}$,
which corresponds to the deletion of all subtrees of $G_{\mathcal{A}}$ rooted below y, is necessary to make
this rule "compatible" with the trace technique we want to employ in order to obtain a
PSpace-algorithm. The correctness of the trace approach relies on the property that, once
we have established the existence of a complete and clash-free "subtree" for a node x, we
can remove this tree from memory because it will not be modified by the algorithm. In
the presence of inverse roles this can no longer be taken for granted, as can be illustrated
by the concept

$$C = (\leq 0\ R_1\ B) \sqcap (\geq 1\ R_1\ (A \sqcup B)) \sqcap (\geq 1\ R_2\ (\leq 0\ R_2^{-1}\ (\geq 1\ R_1\ A))).$$

Figure 4.7 shows the beginning of a run of the ALCQIb-algorithm. After a number of steps,
a successor y of x has been generated and the expansion of constraints has produced a
complete and clash-free subtree for y. Nevertheless, the concept C is not satisfiable. The
expansion of $(\geq 1\ R_2\ (\leq 0\ R_2^{-1}\ (\geq 1\ R_1\ A)))$ will eventually lead to the generation of the
constraint $x : {\sim}(\geq 1\ R_1\ A) = (\leq 0\ R_1\ A)$ in $\mathcal{A}_5$, which disallows $R_1$-successors that satisfy
A. This conflicts with the constraints $x : (\leq 0\ R_1\ B)$ and $x : (\geq 1\ R_1\ (A \sqcup B))$, which
require a successor of x that satisfies A. Consider an implementation of the algorithm that
employs tracing: the ABox $\mathcal{A}_3$ contains a complete and clash-free subtree for y, which is
deleted from memory. It is recorded that the constraint $x : (\geq 1\ R_1\ (A \sqcup B))$ has been
satisfied, and this constraint is never reconsidered, so the conflict goes undetected. To make
tracing possible, the $\rightarrow_{\mathrm{choose}}$-rule deletes all information about y when stepping from $\mathcal{A}_4$
to $\mathcal{A}_5$, which, while duplicating some work, makes it possible to detect this conflict even
when tracing through the ABox. An implementation that uses tracing can safely discard
the information about y from memory once the existence of a complete and clash-free
subtree has been established in $\mathcal{A}_3$ because, whenever the effect of an application of the
$\rightarrow_{\mathrm{choose}}$-rule might conflict with assertions for a successor y, all required successors of x
have to be re-generated anyway.

A similar technique will be used in a subsequent chapter to obtain a PSpace-result for
another DL with inverse roles.



4.4.2 Correctness of the Algorithm 

As for ALCQ, we show correctness of the ALCQIb-algorithm along the lines of Theorem 3.6.

Figure 4.7 Inverse roles make tracing difficult.

$$\{x : C\} \rightarrow_\sqcap \cdots \rightarrow_\sqcap \underbrace{\{x : C,\ x : (\leq 0\ R_1\ B),\ x : (\geq 1\ R_1\ (A \sqcup B)),\ x : (\geq 1\ R_2\ (\leq 0\ R_2^{-1}\ (\geq 1\ R_1\ A)))\}}_{\mathcal{A}_1}$$
$$\rightarrow_\geq \underbrace{\mathcal{A}_1 \cup \{(x, y) : R_1,\ (y, x) : R_1^{-1},\ y : A \sqcup B,\ y : \neg B\}}_{\mathcal{A}_2} \rightarrow_\sqcup \underbrace{\mathcal{A}_2 \cup \{y : A\}}_{\mathcal{A}_3}$$
$$\rightarrow_\geq \underbrace{\mathcal{A}_3 \cup \{(x, z) : R_2,\ (z, x) : R_2^{-1},\ z : (\leq 0\ R_2^{-1}\ (\geq 1\ R_1\ A))\}}_{\mathcal{A}_4}$$

Termination

Obviously, the deletion of constraints in $\mathcal{A}$ makes a new proof of termination necessary,
since the proof of Corollary 4.11 relied on the fact that constraints were never removed from
the ABox. Note, however, that Lemma 4.10 still holds for ALCQIb.

Lemma 4.22 (Termination) 

Any sequence of rule applications starting from an ABox $\mathcal{A} = \{x : C\}$ of the ALCQIb
algorithm is finite.

Proof. The sequence of rule applications induces a sequence of trees. As before, the depth
and out-degree of these trees are bounded in |C| by Lemma 4.10. For each individual x, L(x)
is a subset of the finite set clos(C). Each application of a rule either

• adds a constraint of the form x : D and hence adds an element to L(x), or

• adds fresh individuals to $\mathcal{A}$ and hence adds additional nodes to the tree $G_{\mathcal{A}}$, or

• adds a constraint to a node y and deletes all subtrees rooted below y.

Assume that the algorithm does not terminate. Due to the mentioned facts this can only
be because of an infinite number of deletions of subtrees. Each node can of course only
be deleted once, but the successors of a single node may be deleted several times. The
root of the constraint system cannot be deleted because it has no predecessor. Hence there
are nodes that are never deleted. Choose one of these nodes y with maximum distance
from the root, i.e., which has a maximum number of ancestors in $\prec_{\mathcal{A}}$. Suppose that y's
successors are deleted only finitely many times. This cannot be the case because, after the
last deletion of y's successors, the "new" successors were never deleted and thus y would
not have maximum distance from the root. Hence y triggers the deletion of its successors
infinitely many times. However, the $\rightarrow_{\mathrm{choose}}$-rule is the only rule that leads to a deletion,
and it simultaneously leads to an increase of L(y), namely by the missing concept which
caused the deletion of y's successors. This implies the existence of an infinitely increasing
chain of subsets of clos(C), which is clearly impossible. ■



Soundness and Completeness 

We start by proving an important property of the interplay of the $\rightarrow_\geq$-rule and the
$\rightarrow_{choose}$-rule.

Lemma 4.23 

Let $\mathcal{A}_1, \mathcal{A}_2, \mathcal{A}_3$ be ABoxes generated by the ALCQIb-algorithm, such that $\mathcal{A}_2$ is derived
from $\mathcal{A}_1$ by application of the $\rightarrow_\geq$-rule to an individual $x$ in a way that creates the new
successor $y$ of $x$, and $\mathcal{A}_3$ is derived from $\mathcal{A}_2$ by zero or more rule applications. If both
$x, y$ occur in $\mathcal{A}_3$, then $\{D \mid x : D \in \mathcal{A}_1\} = \{D \mid x : D \in \mathcal{A}_3\}$ and the $\rightarrow_{choose}$-rule is not
applicable to $x$ in $\mathcal{A}_3$ in a way that adds a concept assertion for $y$.

Proof. Assume that $x, y$ occur in $\mathcal{A}_3$. Then they also occur in all intermediate ABoxes
because, once an individual is deleted from the constraint system, it is never re-introduced.
The proof is by induction on the number of rule applications necessary to derive $\mathcal{A}_3$ from
$\mathcal{A}_2$. If no rule must be applied, then $\mathcal{A}_2 = \mathcal{A}_3$ holds, and since application of the $\rightarrow_\geq$-rule
to $x$ does not alter the concepts asserted for $x$, we are done. Now assume that the lemma
holds for every ABox $\mathcal{A}'$ derivable from $\mathcal{A}_2$ by $n$ rule applications.

Let $\mathcal{A}_3$ be derivable from $\mathcal{A}_2$ in $n + 1$ steps and let $\mathcal{A}'$ be an ABox such that
$\mathcal{A}_2 \rightarrow^n \mathcal{A}' \rightarrow \mathcal{A}_3$. Since $\{D \mid x : D \in \mathcal{A}_1\} = \{D \mid x : D \in \mathcal{A}'\}$ holds by induction, also
$\{D \mid x : D \in \mathcal{A}_1\} = \{D \mid x : D \in \mathcal{A}_3\}$ holds as long as the rule application that derives $\mathcal{A}_3$
from $\mathcal{A}'$ does not alter the concepts asserted for $x$.

The $\rightarrow_\geq$-rule does not alter the constraints for any individual that is already present
in the ABox because it introduces a fresh individual.

The $\rightarrow_\sqcap$- or $\rightarrow_\sqcup$-rule cannot be applicable to $x$ because, if the rule is applicable in $\mathcal{A}'$,
then, since $\{D \mid x : D \in \mathcal{A}_1\} = \{D \mid x : D \in \mathcal{A}'\}$, it is also applicable in $\mathcal{A}_1$ and the
$\rightarrow_\geq$-rule that creates $y$ would not have been applicable. Assume that an application of the
$\rightarrow_{choose}$-rule asserts an additional concept for $x$. Any application of the $\rightarrow_{choose}$-rule that
adds a constraint for $x$ removes the individuals $\{z \mid x \prec_{\mathcal{A}'} z\}$ from $\mathcal{A}'$. This includes $y$ and
hence $y$ would not occur in $\mathcal{A}_3$, in contradiction to the assumption that $x, y$ occur in $\mathcal{A}_3$.

Since the concept assertions for $x$ have not changed since the generation of $y$, it holds
that $x : (\bowtie n\ \omega\ D) \in \mathcal{A}_3$ iff $x : (\bowtie n\ \omega\ D) \in \mathcal{A}_1$ and so $\{y : D, y : {\sim}D\} \cap \mathcal{A}_2 \neq \emptyset$ is
ensured by the $\rightarrow_\geq$-rule that creates $y$. The individual $y$ still occurs in $\mathcal{A}_3$ and hence
$\{y : D, y : {\sim}D\} \cap \mathcal{A}_3 \neq \emptyset$ holds, which implies that the $\rightarrow_{choose}$-rule cannot be applied for the
constraint $x : (\bowtie n\ \omega\ D) \in \mathcal{A}_3$ in a way that adds $y : D$ or $y : {\sim}D$ to $\mathcal{A}_3$. ■

The correctness of the ALCQIb-algorithm is again proved along the lines of Theorem 3.6,
but in a slightly different manner than it was proved for ALCQ. Instead of proving local
correctness of the rules, which is difficult to establish due to the deletion of constraints by
the $\rightarrow_{choose}$-rule, we use Property 5'. Additionally, we require a stronger notion of
satisfiability than standard ABox satisfiability. As for ALCQ, we define the differentiation
$\widehat{\mathcal{A}}$ of an ABox $\mathcal{A}$ by setting

$$\widehat{\mathcal{A}} = \mathcal{A} \cup \{ y \not\doteq z \mid \{(x,y) : R, (x,z) : S\} \subseteq \mathcal{A},\ y \neq z \}.$$

Note the slight difference to the definition for ALCQ, where only those individuals reachable
from $x$ via the same role $R$ were asserted to be distinct. Here, all individuals reachable from
$x$ via an arbitrary role are asserted to be distinct. We say that an ABox $\mathcal{A}$ is satisfiable iff
there exists a model $\mathcal{I}$ of its differentiation $\widehat{\mathcal{A}}$ that, in addition to what is required by the
standard notion of ABox satisfiability from Definition 2.5, satisfies:

$$\text{if } (x,y) : R \in \mathcal{A} \text{ then } \{R \mid (x,y) : R \in \mathcal{A}\} = \{R \mid (x^{\mathcal{I}}, y^{\mathcal{I}}) \in R^{\mathcal{I}}\} \cap \overline{\mathsf{NR}}_C. \qquad (\S)$$

Note that this additional property is trivially satisfied by a canonical interpretation. 

Obviously, Properties 1 and 2 of Theorem 3.6 hold for every ABox generated by the
ALCQIb-algorithm.

Lemma 4.24 (Soundness) 

Let $\mathcal{A}$ be a complete and clash-free ABox generated by the ALCQIb-algorithm. Then $\mathcal{A}$ is
satisfiable, i.e., there exists a model $\mathcal{I}$ of $\widehat{\mathcal{A}}$ that additionally satisfies (§).

Proof. Let $\mathcal{A}$ be a complete and clash-free ABox obtained by a sequence of rule
applications starting from $\{x_0 : C\}$. We show that the canonical interpretation $\mathcal{I}_{\mathcal{A}}$ (as defined
in Definition 3.7) is indeed a model of $\widehat{\mathcal{A}}$ that satisfies (§). Note that we need the
condition "$(x,y) : R \in \mathcal{A}$ iff $(y,x) : R^{-1} \in \mathcal{A}$", which is maintained by the algorithm, to
make sure that all information from the ABox is reflected in the canonical interpretation.

Every canonical interpretation trivially satisfies (§) and also every two different
individuals are interpreted differently, which takes care of the additional assertions in $\widehat{\mathcal{A}}$. So, it
remains to show that $x : D \in \mathcal{A}$ implies $x \in D^{\mathcal{I}_{\mathcal{A}}}$ for all individuals $x$ in $\mathcal{A}$ and all concepts
$D \in clos(C)$. This is done by induction over the norm of concepts $\|\cdot\|$. The only interesting
cases that are different from the ALCQ-case are the qualifying number restrictions.

• $x : (\geq n\ \omega\ D) \in \mathcal{A}$ implies $\sharp\omega^{\mathcal{A}}(x, D) \geq n$ because $\mathcal{A}$ is complete. Hence, there are $n$
distinct individuals $y_1, \dots, y_n$ with $y_i : D \in \mathcal{A}$ and $\{R \mid (x, y_i) : R \in \mathcal{A}\} \models \omega$ for each
$1 \leq i \leq n$. By induction and Lemma 4.20, we have $y_i \in D^{\mathcal{I}_{\mathcal{A}}}$ and $(x, y_i) \in \omega^{\mathcal{I}_{\mathcal{A}}}$ and
hence $x \in (\geq n\ \omega\ D)^{\mathcal{I}_{\mathcal{A}}}$.

• $x : (\leq n\ \omega\ D) \in \mathcal{A}$ implies, for any $R$ that occurs in $\omega$ and any $y$ with $(x, y) : R \in \mathcal{A}$,
$y : D \in \mathcal{A}$ or $y : {\sim}D \in \mathcal{A}$. For any predecessor of $x$, this is guaranteed by the
$\rightarrow_{choose}$-rule. For any successor, this follows from Lemma 4.23. Hence, $x : (\leq n\ \omega\ D)$
is present in $\mathcal{A}$ by the time $y$ is generated and the $\rightarrow_\geq$-rule ensures $y : D \in \mathcal{A}$ or
$y : {\sim}D \in \mathcal{A}$.
We show that $\sharp\omega^{\mathcal{I}_{\mathcal{A}}}(x, D) \leq \sharp\omega^{\mathcal{A}}(x, D)$: assume $\sharp\omega^{\mathcal{I}_{\mathcal{A}}}(x, D) > \sharp\omega^{\mathcal{A}}(x, D)$. This implies
the existence of some $y$ with $(x, y) \in \omega^{\mathcal{I}_{\mathcal{A}}}$ and $y \in D^{\mathcal{I}_{\mathcal{A}}}$ but $y : D \notin \mathcal{A}$. Due to the
syntactic restriction on role expressions, $(x, y) \in \omega^{\mathcal{I}_{\mathcal{A}}}$ implies $(x, y) \in R^{\mathcal{I}_{\mathcal{A}}}$ for some $R$
that occurs in $\omega$ and hence $(x, y) : R \in \mathcal{A}$ must hold by construction of $\mathcal{I}_{\mathcal{A}}$. The
$\rightarrow_{choose}$-rule and the $\rightarrow_\geq$-rule then guarantee that $y : D \notin \mathcal{A}$ implies $y : {\sim}D \in \mathcal{A}$.
By induction this yields $y \in ({\sim}D)^{\mathcal{I}_{\mathcal{A}}}$, in contradiction to $y \in D^{\mathcal{I}_{\mathcal{A}}}$. ■

Lemma 4.25 (Local Completeness) 

If $\mathcal{A}$ is a satisfiable ABox generated by the ALCQIb-algorithm and a rule is applicable to
$\mathcal{A}$, then it can be applied in a way that yields a satisfiable $\mathcal{A}'$.

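Proof. Let $\mathcal{I}$ be a model of $\widehat{\mathcal{A}}$ that satisfies (§), as required by our notion of satisfiability.
We distinguish the different rules. For most rules $\mathcal{I}$ can remain unchanged; in all other
cases we explicitly state how $\mathcal{I}$ must be modified in order to witness the satisfiability of
the modified ABox.

• The $\rightarrow_\sqcap$-rule: if $x : C_1 \sqcap C_2 \in \mathcal{A}$, then $x^{\mathcal{I}} \in (C_1 \sqcap C_2)^{\mathcal{I}}$. This implies $x^{\mathcal{I}} \in C_i^{\mathcal{I}}$ for
$i = 1, 2$, and hence satisfiability is preserved.

• The $\rightarrow_\sqcup$-rule: if $x : C_1 \sqcup C_2 \in \mathcal{A}$, then $x^{\mathcal{I}} \in (C_1 \sqcup C_2)^{\mathcal{I}}$. This implies $x^{\mathcal{I}} \in C_1^{\mathcal{I}}$ or
$x^{\mathcal{I}} \in C_2^{\mathcal{I}}$. Hence the $\rightarrow_\sqcup$-rule can add a constraint $x : D$ with $D \in \{C_1, C_2\}$ and
maintains satisfiability.

• The $\rightarrow_{choose}$-rule: obviously, either $y^{\mathcal{I}} \in D^{\mathcal{I}}$ or $y^{\mathcal{I}} \notin D^{\mathcal{I}}$ for any individual $y$ in $\mathcal{A}$.
Hence, the rule can always be applied in a way that maintains satisfiability. Deletion
of constraints as performed by the $\rightarrow_{choose}$-rule cannot cause unsatisfiability.

• The $\rightarrow_\geq$-rule: if $x : (\geq n\ \omega\ D) \in \mathcal{A}$, then $x^{\mathcal{I}} \in (\geq n\ \omega\ D)^{\mathcal{I}}$. This implies
$\sharp\omega^{\mathcal{I}}(x^{\mathcal{I}}, D) \geq n$. We claim that there is an element $a \in \Delta^{\mathcal{I}}$ such that

$$(x^{\mathcal{I}}, a) \in \omega^{\mathcal{I}},\ a \in D^{\mathcal{I}}, \text{ and } a \notin \{z^{\mathcal{I}} \mid (x,z) : S \in \mathcal{A} \text{ for some } S \in \overline{\mathsf{NR}}_C\}. \qquad (*)$$

We will prove this claim later. Let $E_1, \dots, E_k$ be an enumeration of the set
$\{E \mid x : (\bowtie m\ \sigma\ E) \in \mathcal{A}\}$. The $\rightarrow_\geq$-rule can add the constraints

$$\mathcal{A}' = \{y : E_i \mid a \in E_i^{\mathcal{I}}\} \cup \{y : {\sim}E_i \mid a \notin E_i^{\mathcal{I}}\}$$

$$\mathcal{A}'' = \{(x,y) : R \mid R \in \overline{\mathsf{NR}}_C,\ (x^{\mathcal{I}}, a) \in R^{\mathcal{I}}\} \cup \{(y,x) : R \mid R \in \overline{\mathsf{NR}}_C,\ (a, x^{\mathcal{I}}) \in R^{\mathcal{I}}\}$$

as well as $\{y : D\}$ to $\mathcal{A}$. If we set $\mathcal{I}' := \mathcal{I}[y \mapsto a]$, then $\mathcal{I}'$ is a model of the
differentiation of the ABox obtained this way that satisfies (§).

Why does there exist an element $a$ that satisfies $(*)$? Let $b \in \Delta^{\mathcal{I}}$ be an individual
with $(x^{\mathcal{I}}, b) \in \omega^{\mathcal{I}}$ and $b \in D^{\mathcal{I}}$ that appears as an image of an arbitrary element $z$
with $(x, z) : S \in \mathcal{A}$ for some $S \in \overline{\mathsf{NR}}_C$. The requirement (§) implies that
$\{R \mid (x, z) : R \in \mathcal{A}\} \models \omega$, and also $z : D \in \mathcal{A}$ must hold. This can be shown as follows: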






Assume $z : D \notin \mathcal{A}$. This implies $z : {\sim}D \in \mathcal{A}$: either $z \prec_{\mathcal{A}} x$, then in order for the
$\rightarrow_\geq$-rule to be applicable, no non-generating rule, and in particular not the $\rightarrow_{choose}$-rule, is
applicable to $x$ and its ancestor, which implies $\{z : D, z : {\sim}D\} \cap \mathcal{A} \neq \emptyset$. If not
$z \prec_{\mathcal{A}} x$, then $z$ must have been generated by an application of the $\rightarrow_\geq$-rule to $x$.
Lemma 4.23 implies that at the time of the generation of $z$ already $x : (\geq n\ \omega\ D) \in \mathcal{A}$
held and hence the $\rightarrow_\geq$-rule ensures $\{z : D, z : {\sim}D\} \cap \mathcal{A} \neq \emptyset$.

In any case $z : {\sim}D \in \mathcal{A}$ holds, which implies $b \notin D^{\mathcal{I}}$, in contradiction to $b \in D^{\mathcal{I}}$.

Together this implies that, whenever an element $b$ with $(x^{\mathcal{I}}, b) \in \omega^{\mathcal{I}}$ and $b \in D^{\mathcal{I}}$
is assigned to an individual $z$ with $(x, z) : S \in \mathcal{A}$, then it must be assigned to an
individual that contributes to $\sharp\omega^{\mathcal{A}}(x, D)$. Since the $\rightarrow_\geq$-rule is applicable, there are
fewer than $n$ such individuals and hence there must be an unassigned element $a$ as
required by $(*)$. ■

The $\rightarrow_{choose}$-rule deletes only assertions for successors of a node and hence never deletes
any assertions for the root $x_0$. Hence, for any ABox $\mathcal{A}$ generated by application of the
completion rules from an initial ABox $\{x_0 : C\}$, $\{x_0 : C\} \subseteq \mathcal{A}$ holds and we get the
following.

Lemma 4.26 

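If a complete and clash-free ABox $\mathcal{A}$ can be generated from an initial ABox $\mathcal{A}_0$, then $\mathcal{A}_0$
is satisfiable.

Proof. From Lemma 4.24, it follows that $\mathcal{A}$ is satisfiable, and every model of $\mathcal{A}$ is also a
model of $\mathcal{A}_0 = \{x_0 : C\}$ because $\mathcal{A}_0 \subseteq \mathcal{A}$. Moreover, $\mathcal{A}_0$ contains no role assertions, which implies
$\widehat{\mathcal{A}}_0 = \mathcal{A}_0$, and every interpretation trivially satisfies (§) for $\mathcal{A}_0$. ■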

Hence, we can apply Theorem 3.6 and get:

Corollary 4.27

The ALCQIb-algorithm is a non-deterministic decision procedure for satisfiability of
ALCQIb-concepts.

Proof. Termination has been shown in Lemma 4.22. As mentioned before, Properties 1
and 2 of Theorem 3.6 are trivially satisfied due to the chosen notion of ABox satisfiability.
Property 3 has been shown in Lemma 4.24, Property 4 in Lemma 4.25 and Property 5' in
Lemma 4.26. ■



4.4.3 Complexity of the Algorithm 

As for the optimal algorithm for ALCQ, we have to show that the ALCQIb-algorithm can
be implemented in a way that consumes only polynomial space. This is done similarly to
the ALCQ-case, but we have to deal with two additional problems: we have to find a way to
implement the "reset-restart" caused by the $\rightarrow_{choose}$-rule, and we have to store the values
of the relevant counters $\sharp\omega^{\mathcal{A}}(x, D)$. It is impossible to store the values for every possible
role expression $\omega$ because there are exponentially many inequivalent ones. Fortunately,
storing only the values for those $\omega$ that actually appear in $C$ is sufficient.

Lemma 4.28 

The ALCQIb-algorithm can be implemented in PSpace.

Proof. Consider the algorithm in Figure 4.8, where $\Omega_C$ denotes all role expressions that
occur in the input concept $C$. Like the algorithm for ALCQ, the ALCQIb-algorithm re-uses
the space used to check for the existence of a complete and clash-free "subtree" for each
successor $y$ of an individual $x$ and keeps only a single path in memory at one time. Counter
variables are used to keep track of the values $\sharp\omega^{\mathcal{A}}(x, D)$ for all $\omega \in \Omega_C$ and $D \in clos(C)$.

Resetting a node and restarting the generation of its successors is achieved by jumping
to the label restart in the algorithm, which re-initializes all successor counters for a node
$x$. Note how the predecessor of a node is taken into account when initializing the counter
variables. Since $G_{\mathcal{A}}$ is a tree, every newly generated node has a uniquely determined
predecessor and, since only safe role expressions occur in $\Omega_C$, it is sufficient to take only
this predecessor node into account when initializing the counter.

Let $n = |C|$. For every node $x$ of a path in $G_{\mathcal{A}}$, $O(n)$ bits suffice to store the constraints
of the form $x : D$ and $O(n^4)$ suffice to store the counters (in binary representation) because
$\sharp\Omega_C = O(n)$, $\sharp clos(C) = O(n)$, and the out-degree of $G_{\mathcal{A}}$ is bounded by $O(n) \times 2^n$ (by
Lemma 4.10, which also holds for ALCQIb). Also by Lemma 4.10, the length of a path
in $G_{\mathcal{A}}$ is bounded by $O(n)$, which yields an overall memory requirement of $O(n^5)$ for a
path. ■

Obviously, satisfiability of ALCQIb-concepts is PSpace-hard, hence Lemma 4.28 and
Savitch's Theorem (Savitch, 1970) yield:

Theorem 4.29 

Satisfiability of ALCQIb-concepts is PSpace-complete if the numbers in the input are
represented using binary coding.

As a simple corollary, we get the solution of an open problem in (Donini et al., 1997):

Corollary 4.30

Satisfiability of ALCNR-concepts is PSpace-complete if the numbers in the input are
represented using binary coding.

Proof. The DL ALCNR is a syntactic restriction of the DL ALCQIb, where we do not allow
for inverse roles and, in number restrictions $(\bowtie n\ \omega\ D)$, $\omega$ must be a conjunction of positive
roles and $D$ the tautological concept $\top$. Hence, the ALCQIb-algorithm can immediately be
applied to ALCNR-concepts, which yields decidability in PSpace. ■

Figure 4.8 A non-deterministic PSpace decision procedure for ALCQIb-satisfiability.

ALCQIb-SAT($C$) := sat($x_0$, $\{x_0 : C\}$)

sat($x$, $\mathcal{A}$):
    allocate counters $\sharp\omega^{\mathcal{A}}(x, D)$ for all $\omega \in \Omega_C$ and $D \in clos(C)$.
  restart:
    for each counter $\sharp\omega^{\mathcal{A}}(x, D)$:
        if ($x$ has a predecessor $y \prec_{\mathcal{A}} x$ with $\{R \mid (x,y) : R \in \mathcal{A}\} \models \omega$ and $y : D \in \mathcal{A}$)
            then $\sharp\omega^{\mathcal{A}}(x, D) := 1$ else $\sharp\omega^{\mathcal{A}}(x, D) := 0$
    while (the $\rightarrow_\sqcap$- or the $\rightarrow_\sqcup$-rule can be applied at $x$) and ($\mathcal{A}$ is clash-free) do
        apply the $\rightarrow_\sqcap$- or the $\rightarrow_\sqcup$-rule to $\mathcal{A}$.
    od
    if $\mathcal{A}$ contains a clash then return "not satisfiable".
    if the $\rightarrow_{choose}$-rule is applicable to the constraint $x : (\bowtie n\ \omega\ D) \in \mathcal{A}$
        then return "restart with $D$"
    while (the $\rightarrow_\geq$-rule applies to a constraint $x : (\geq n\ \omega\ D) \in \mathcal{A}$) do
        non-deterministically choose $\mathbf{R} \subseteq \overline{\mathsf{NR}}_C$
        if $\mathbf{R} \not\models \omega$ then return "not satisfiable"
        $\mathcal{A}_{new} := \{y : D\} \cup \mathcal{A}' \cup \mathcal{A}''$
        where
            $y$ is a fresh individual
            $\{E_1, \dots, E_k\} = \{E \mid x : (\bowtie m\ \sigma\ E) \in \mathcal{A}\}$
            $\mathcal{A}' = \{y : D_1, \dots, y : D_k\}$, and
            $D_i$ is chosen non-deterministically from $\{E_i, {\sim}E_i\}$
            $\mathcal{A}'' = \{(x,y) : R, (y,x) : R^{-1} \mid R \in \mathbf{R}\}$
        for each $E$ with $y : E \in \mathcal{A}_{new}$ and $\sigma \in \Omega_C$ with $\mathbf{R} \models \sigma$ do
            increment $\sharp\sigma^{\mathcal{A}}(x, E)$
            if $x : (\leq m\ \sigma\ E) \in \mathcal{A}$ and $\sharp\sigma^{\mathcal{A}}(x, E) > m$
                then return "not satisfiable".
        result := sat($y$, $\mathcal{A} \cup \mathcal{A}_{new}$)
        if result = "not satisfiable" then return "not satisfiable"
        if result = "restart with $D$" then
            $\mathcal{A} := \mathcal{A} \cup \{x : E\}$
                where $E$ is chosen non-deterministically from $\{D, {\sim}D\}$
            goto restart
    od
    discard the counters for $x$ from memory.
    return "satisfiable"
4.5 Reasoning with ALCQIb-Knowledge Bases

So far, we have only dealt with the problem of concept satisfiability rather than satisfiability
of knowledge bases. In this section, we will examine the complexity of reasoning with
knowledge bases for the DL ALCQIb. For the more "standard" DL ALCQI, this problem
has been shown to be ExpTime-complete by De Giacomo (1995), but this result does not
easily transfer to ALCQIb because of the role expressions, and the proof in (De Giacomo,
1995) is only valid in case of unary coding of numbers in the input. Here, we are aiming
for a proof that is valid also in the case of binary coding of numbers.

In a first step, we deal with concept satisfiability w.r.t. general TBoxes and prove that 
this problem can be solved in ExpTime using an automata approach. ABoxes are then 
handled by a pre-completion algorithm similar to the one presented by Hollunder (1996) 
(see also Section 3.2.3). It should be mentioned that the algorithms developed in this 
section are by no means intended for implementation. They are used only to obtain tight 
worst-case complexity results. We are also very generous in size or time estimates. 

The lower complexity bound for ALCQIb with general TBoxes is an immediate
consequence of Theorem 3.18 because ALC is strictly contained in ALCQIb.

Lemma 4.31

Satisfiability of ALCQIb-concepts (and hence of ABoxes) w.r.t. general TBoxes is
ExpTime-hard.

To establish a matching upper complexity bound, we employ an automata approach,
where (un-)satisfiability of concepts is reduced to emptiness of suitable finite automata,
usually Büchi word or tree automata (Thomas, 1992). This approach is a valuable tool to
establish exact complexity results for DLs and modal logics (Vardi & Wolper, 1986; Lutz
& Sattler, 2000), particularly for ExpTime-complete logics, where tableau approaches,
due to their non-deterministic nature, either fail entirely or require very sophisticated
techniques (Donini & Massacci, 2000) to prove decidability of the decision problem in
ExpTime.

In general, the automata approach works as follows. To test satisfiability of a concept
$C$ w.r.t. a TBox $\mathcal{T}$, an automaton $\mathfrak{A}_{C,\mathcal{T}}$ is constructed that accepts exactly (abstractions
of) models of $C$ and $\mathcal{T}$, so that $\mathfrak{A}_{C,\mathcal{T}}$ accepts a non-empty language iff $C$ is satisfiable w.r.t.
$\mathcal{T}$. For ALCQIb, we do not require the full complexity of Büchi tree automata; the simpler
formalism of looping tree automata (Vardi & Wolper, 1994) suffices.

Definition 4.32 (Looping Tree Automata) 

For a natural number $n$, let $[n]$ denote the set $\{1, \dots, n\}$. An $n$-ary infinite tree over the
alphabet $\Sigma$ is a mapping $t : [n]^* \rightarrow \Sigma$, where $[n]^*$ denotes the set of finite strings over $[n]$.

An $n$-ary looping tree automaton is a tuple $\mathfrak{A} = (Q, \Sigma, I, \delta)$, where $Q$ is a finite set of
states, $\Sigma$ is a finite alphabet, $I \subseteq Q$ is the set of initial states, and $\delta \subseteq Q \times \Sigma \times Q^n$ is the
transition relation. Sometimes, we will view $\delta$ as a function from $Q \times \Sigma$ to $2^{Q^n}$ and write
$\delta(q, \sigma)$ for the set of tuples $\{\mathbf{q} \mid (q, \sigma, \mathbf{q}) \in \delta\}$.

A run of $\mathfrak{A}$ on an $n$-ary infinite tree $t$ over $\Sigma$ is an $n$-ary infinite tree $r$ over $Q$ such
that, for every $p \in [n]^*$,

$$(r(p), t(p), (r(p1), \dots, r(pn))) \in \delta.$$

An automaton $\mathfrak{A}$ accepts $t$ iff there is a run $r$ of $\mathfrak{A}$ on $t$ with $r(\epsilon) \in I$. With $L(\mathfrak{A})$ we
denote the language accepted by $\mathfrak{A}$, defined by $L(\mathfrak{A}) := \{t \mid \mathfrak{A} \text{ accepts } t\}$. ◇

For a looping automaton $\mathfrak{A}$, emptiness of $L(\mathfrak{A})$ can be decided efficiently.

Fact 4.33

Let $\mathfrak{A} = (Q, \Sigma, I, \delta)$ be an $n$-ary looping tree automaton. Emptiness of $L(\mathfrak{A})$ can be decided
in time $O(\sharp Q + \sharp\delta)$.

A polynomial bound directly follows from the quadratic time algorithm for Büchi tree
automata (Vardi & Wolper, 1986), of which looping tree automata are special cases. A
closer inspection of this algorithm shows that one can even obtain a linear algorithm using
the techniques from (Dowling & Gallier, 1984). For our purposes, the mentioned
quadratic algorithm, and indeed any polynomial algorithm, suffices.
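To illustrate why emptiness is easy for looping automata, the following Python sketch repeatedly removes states that cannot start a run until nothing changes; the language is non-empty iff some initial state survives. It is a naive fixpoint iteration, not the linear-time algorithm referred to above, and the representation (hashable states, transitions as triples) and the name `is_empty` are assumptions made only for this example.

```python
def is_empty(states, initial, delta):
    """Emptiness test for a looping tree automaton (naive fixpoint sketch).

    states:  iterable of hashable states
    initial: iterable of initial states
    delta:   list of (state, symbol, successors), successors being a tuple of states
    """
    alive = set(states)
    changed = True
    while changed:
        changed = False
        for q in list(alive):
            # q stays alive only if some transition from q leads to alive states only
            has_transition = any(
                src == q and all(s in alive for s in succ)
                for (src, _symbol, succ) in delta
            )
            if not has_transition:
                alive.discard(q)
                changed = True
    # every surviving state can start an infinite run
    return not (alive & set(initial))
```

Since every run is accepting in a looping automaton, no acceptance condition has to be checked; it suffices that each surviving state can always be continued.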

Before we formally define $\mathfrak{A}_{C,\mathcal{T}}$, we give an informal description of the employed
construction of the automaton and the abstraction from an interpretation $\mathcal{I}$ to a tree $T$ we
use. Generally speaking, nodes of $T$ correspond to elements of an unraveling of $\mathcal{I}$. In the
label of the node, we record the relevant (sub-)concepts from $C$ and $\mathcal{T}$ that are satisfied
by this element, and also which roles connect the element to its unique predecessor in
$\mathcal{I}$. This information has to be recorded at the node since edges of a tree accepted by a
looping automaton are unlabelled. Hence, the label of a node is a locally consistent set of
"relevant" concepts (as defined below) paired with a set of "relevant" roles.

For now, we fix an ALCQIb-concept $C$ in NNF and an ALCQIb-TBox $\mathcal{T}$. Let $\overline{\mathsf{NR}}_{C,\mathcal{T}}$ be
the set of role names that occur in $C$ and $\mathcal{T}$ together with their inverses and $\Omega_{C,\mathcal{T}}$ the set
of role expressions that occur in $C$ and $\mathcal{T}$. The closure $clos(C,\mathcal{T})$ of "relevant" concepts
is defined as the smallest set $X$ of concepts such that

• $C \in X$ and $\mathrm{NNF}(\neg C_1 \sqcup C_2) \in X$ for every $C_1 \sqsubseteq C_2 \in \mathcal{T}$

• $X$ is closed under sub-concepts and the application of ${\sim}$, the operator that maps
every concept $C$ to $\mathrm{NNF}(\neg C)$.

Obviously, $\sharp clos(C,\mathcal{T}) = O(|C| + |\mathcal{T}|)$ (compare Lemma 4.9).
A subset $\Phi \subseteq clos(C,\mathcal{T})$ is locally consistent iff

• for every $D \in clos(C,\mathcal{T})$, $\Phi \cap \{D, {\sim}D\} \neq \emptyset$ and $\{D, {\sim}D\} \not\subseteq \Phi$,

• for every $C_1 \sqsubseteq C_2 \in \mathcal{T}$, $\mathrm{NNF}(\neg C_1 \sqcup C_2) \in \Phi$,

• if $C_1 \sqcap C_2 \in \Phi$ then $\{C_1, C_2\} \subseteq \Phi$, and

• if $C_1 \sqcup C_2 \in \Phi$ then $\Phi \cap \{C_1, C_2\} \neq \emptyset$.

The set of locally consistent subsets of $clos(C,\mathcal{T})$ is defined by $lc(C,\mathcal{T}) = \{\Phi \subseteq
clos(C,\mathcal{T}) \mid \Phi \text{ is l.c.}\}$. Obviously, for every element $x$ in a model of $\mathcal{T}$, there exists a
set $\Phi \in lc(C,\mathcal{T})$ such that all concepts from $\Phi$ are satisfied by $x$.
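The four conditions for local consistency can be checked mechanically. The following Python sketch does so for the propositional fragment only (number restrictions are omitted); the tuple encoding of concepts and the names `neg` and `is_locally_consistent` are assumptions made for this illustration, with `neg` playing the role of the operator ~.

```python
def neg(c):
    """NNF of the negation of c (the '~' operator), propositional fragment only."""
    tag = c[0]
    if tag == "atom":
        return ("not", c)
    if tag == "not":          # in NNF, negation occurs only in front of atoms
        return c[1]
    if tag == "and":
        return ("or", neg(c[1]), neg(c[2]))
    if tag == "or":
        return ("and", neg(c[1]), neg(c[2]))
    raise ValueError("unsupported concept: %r" % (c,))


def is_locally_consistent(phi, closure, tbox_nnf=()):
    """Check the four conditions for phi, a subset of closure.

    tbox_nnf holds the concepts NNF(not C1 or C2) for the axioms of the TBox.
    """
    phi = set(phi)
    if not phi <= set(closure):
        return False
    for d in closure:
        # exactly one of D and ~D must be in phi
        if (d in phi) == (neg(d) in phi):
            return False
    if any(c not in phi for c in tbox_nnf):
        return False
    for c in phi:
        if c[0] == "and" and not (c[1] in phi and c[2] in phi):
            return False
        if c[0] == "or" and not (c[1] in phi or c[2] in phi):
            return False
    return True
```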

It remains to describe how the role relationships in $\mathcal{I}$ are mapped to $T$. Unfortunately,
it is not possible to simply map successors in $\mathcal{I}$ to successors in $T$ due to the presence
of binary coding of numbers in number restrictions. A number restriction of the form
$(\geq n\ \omega\ D)$ requires the existence of $n$ successors, where $n$ may be exponential in the size
of $C$ if numbers are coded in binary. In this case, the transition table of the corresponding
automaton requires double-exponential space in the size of $C$ and the automata approach
would not yield the ExpTime-result we desire.

We overcome this problem as follows. Instead of using a $k$-ary tree, where $k$ somehow
depends on the input $C$ and $\mathcal{T}$, we use a binary tree. Required successors $t_i$ of an element
$s$ in $\mathcal{I}$ are not mapped to direct successors of the node corresponding to $s$ but rather to
nodes that are reachable by zero or more steps to the left and a single step to the right.
The dummy label $(*,*)$ is used for the auxiliary states that are reachable by left-steps only
because these do not correspond to any elements of $\mathcal{I}$. If $n$ successors must be mapped, the
subtree rooted $n$ left-steps from the current node is not needed to map any more successors
and hence is labelled entirely with $(*,*)$. Figure 4.9 illustrates this construction, where
$\Phi_x$ denotes the concepts from $clos(C,\mathcal{T})$ that are satisfied by $x$ and $\mathbf{R}_x$ the set of roles
connecting $x$ with its predecessor.
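The placement of successors in the binary tree can be sketched in a few lines of Python. Paths are written as strings over the characters "1" (left step) and "2" (right step), with the empty string for the root; the function names are hypothetical and only illustrate the encoding described above.

```python
def successor_paths(p, n):
    """Paths of the nodes representing the n successors t_1, ..., t_n of the
    element at path p: i-1 left steps followed by one right step."""
    return [p + "1" * i + "2" for i in range(n)]


def auxiliary_paths(p, n):
    """Auxiliary nodes p1, ..., p1^(n-1), reached by left steps only."""
    return [p + "1" * i for i in range(1, n)]


def dummy_subtree_root(p, n):
    """Root of the subtree that is labelled entirely with the dummy label."""
    return p + "1" * n
```

For example, three successors of the root are placed at the paths "2", "12" and "112", so the out-degree of the tree stays two no matter how large the numbers in the input are.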

In order to accept exactly the abstractions of models generated by this transformation,
it is necessary to perform additional book-keeping in the states. Since successors of the
element $s$ are spread through the tree, we must equip the states of $\mathfrak{A}_{C,\mathcal{T}}$ "responsible" for
the auxiliary nodes with enough information to ensure that the number restrictions are
"obeyed". For this purpose, we use counters to record the minimal and maximal number of
$\omega$-successors satisfying $D$ that a node may have. This information is initialized whenever
stepping to a right successor and updated when moving to a left successor in the tree. The
counters are modelled as functions as follows.

The maximum number $n_{max}(C,\mathcal{T})$ occurring in a qualifying number restriction in
$clos(C,\mathcal{T})$ is defined by $n_{max}(C,\mathcal{T}) = \max\{n \in \mathbb{N} \mid (\bowtie n\ \omega\ D) \in clos(C,\mathcal{T})\}$ with
$\max(\emptyset) := 0$.

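The set of concepts that occur in number restrictions and hence must be considered at
successor and predecessor nodes is defined by

$$succ(C,\mathcal{T}) = \{D \mid (\bowtie n\ \omega\ D) \in clos(C,\mathcal{T})\}.$$

In the automaton, we keep track of the numbers of witnesses for every occurring role
expression $\omega \in \Omega_{C,\mathcal{T}}$ and concept from $succ(C,\mathcal{T})$. This is done using a set of limiting
functions $limit(C,\mathcal{T})$ defined by

$$limit(C,\mathcal{T}) := \{f \mid f : \Omega_{C,\mathcal{T}} \times succ(C,\mathcal{T}) \rightarrow \{0, \dots, n_{max}(C,\mathcal{T}), \infty\}\}.$$

Figure 4.9 Transforming a model for $C$ into a tree accepted by $\mathfrak{A}_{C,\mathcal{T}}$

[Figure: the successors $t_1, \dots, t_n$ of the element at node $p$ are spread along a chain of left steps. The auxiliary nodes $p1, \dots, p1^{n-1}, p1^n$ are labelled $(*,*)$; the right successors $p2, p12, \dots, p1^{n-1}2$ are labelled $(\Phi_{t_1}, \mathbf{R}_{t_1}), (\Phi_{t_2}, \mathbf{R}_{t_2}), \dots, (\Phi_{t_n}, \mathbf{R}_{t_n})$; the node $p1^n2$ is labelled $(*,*)$.]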

The maximum/minimum number of allowed/required $\omega$-successors satisfying a certain
concept $D$ imposed by number restrictions in a set of concepts is captured by the functions

$$\min, \max : lc(C,\mathcal{T}) \times \Omega_{C,\mathcal{T}} \times succ(C,\mathcal{T}) \rightarrow \{0, \dots, n_{max}(C,\mathcal{T}), \infty\}$$

defined by

$$\max(\Phi, \omega, D) = \min\{n \mid (\leq n\ \omega\ D) \in \Phi\}$$
$$\min(\Phi, \omega, D) = \max\{n \mid (\geq n\ \omega\ D) \in \Phi\}$$

with $\min(\emptyset) := \infty$.
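As a small illustration of these two functions and of the conventions for the empty set, consider the following Python sketch. Number restrictions are encoded as tuples `(op, n, omega, D)` with `op` being `"<="` or `">="`; this encoding and the names `max_limit` and `min_limit` are assumptions of the example, not notation from the text.

```python
import math


def max_limit(phi, omega, d):
    """Upper bound: the least n with an at-most restriction (<= n omega d) in phi,
    or infinity if there is none (the minimum of the empty set)."""
    bounds = [c[1] for c in phi
              if len(c) == 4 and c[0] == "<=" and c[2] == omega and c[3] == d]
    return min(bounds, default=math.inf)


def min_limit(phi, omega, d):
    """Lower bound: the greatest n with an at-least restriction (>= n omega d) in phi,
    or 0 if there is none (the maximum of the empty set)."""
    bounds = [c[1] for c in phi
              if len(c) == 4 and c[0] == ">=" and c[2] == omega and c[3] == d]
    return max(bounds, default=0)
```

The tightest restriction wins in both directions: several at-most restrictions yield their minimum, several at-least restrictions their maximum.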

In the automaton $\mathfrak{A}_{C,\mathcal{T}}$, each state consists of a locally consistent set, a set of roles, and
two limiting functions for the upper and lower bounds. There are three kinds of states.

• states that label nodes of $T$ corresponding to elements of the interpretation. These
states record the locally consistent set $\Phi$ labelling that node, the set of roles that
connect the corresponding element to its unique predecessor in $\mathcal{I}$ and the appropriate
initial values of the counters for this node, taking into account the concepts satisfied
by the predecessor. This is necessary due to the presence of inverse roles.

• states labelling nodes that are reachable from a node $s$ corresponding to an element
of $\mathcal{I}$ by one or more steps to the left. These states are marked by an empty set
of roles and record the locally consistent set labelling $s$ to allow for the correct
initialization of the counters for nodes corresponding to successors of $s$. Moreover,
their limiting functions record the upper and lower bound of $\omega$-successors of $s$ still
allowed/required. According to these functions, their right successor state "expects"
a node corresponding to a successor of $s$ and their left successor state a further
auxiliary node. The limiting functions of this auxiliary state are adjusted according
to the right successor. Once sufficiently many successors have been generated, the
automaton switches to the following dummy state.

• a dummy state $(*,*,*,*)$, which reproduces itself and accepts a tree entirely labelled
with $(*,*)$.

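For a role $R \in \overline{\mathsf{NR}}_{C,\mathcal{T}}$, we define $\mathrm{Inv}(R)$ by setting

$$\mathrm{Inv}(R) = \begin{cases} R^{-1} & \text{if } R \in \mathsf{NR}, \\ S & \text{if } R = S^{-1} \text{ for some } S \in \mathsf{NR}, \end{cases}$$

and for a set of roles $\mathbf{R}$ we define $\mathrm{Inv}(\mathbf{R}) = \{\mathrm{Inv}(R) \mid R \in \mathbf{R}\}$. We are now ready to
define the automaton $\mathfrak{A}_{C,\mathcal{T}}$.

Definition 4.34

Let $C$ be an ALCQIb-concept in NNF and $\mathcal{T}$ an ALCQIb-TBox. The binary looping tree
automaton $\mathfrak{A}_{C,\mathcal{T}} = (Q, \Sigma, I, \delta)$ for $C$ and $\mathcal{T}$ is defined by

$$Q = \left( lc(C,\mathcal{T}) \times 2^{\overline{\mathsf{NR}}_{C,\mathcal{T}}} \times limit(C,\mathcal{T}) \times limit(C,\mathcal{T}) \right) \cup \{(*,*,*,*)\}$$
$$\Sigma = \left( lc(C,\mathcal{T}) \times 2^{\overline{\mathsf{NR}}_{C,\mathcal{T}}} \right) \cup \{(*,*)\}$$
$$I = \{(\Phi, \overline{\mathsf{NR}}_{C,\mathcal{T}}, \ell, h) \in Q \mid C \in \Phi,\ \ell = \lambda\omega D.\min(\Phi,\omega,D),\ h = \lambda\omega D.\max(\Phi,\omega,D)\}$$
$$\delta \subseteq Q \times \Sigma \times Q^2,$$

such that $\delta$ is the maximal transition relation with $((*,*,*,*), (*,*), (*,*,*,*), (*,*,*,*)) \in \delta$
and if $(q_0, \sigma, q_1, q_2) \in \delta$ with $q_0 \neq (*,*,*,*)$ and $q_i = (\Phi_i, \mathbf{R}_i, \ell_i, h_i)$ then

(A1) if $\mathbf{R}_0 \neq \emptyset$ then $\sigma = (\Phi_0, \mathbf{R}_0)$ else $\sigma = (*,*)$

(A2) if, for all $\omega \in \Omega_{C,\mathcal{T}}$ and $D \in succ(C,\mathcal{T})$, $\ell_0(\omega, D) = 0$, then $q_1 = q_2 = (*,*,*,*)$

(A3) otherwise, $\Phi_2 \in lc(C,\mathcal{T})$ and $\mathbf{R}_2 \subseteq \overline{\mathsf{NR}}_{C,\mathcal{T}}$ such that there is an $\omega \in \Omega_{C,\mathcal{T}}$ and a $D \in \Phi_2$
with $\mathbf{R}_2 \models \omega$ and $\ell_0(\omega, D) > 0$. As an auxiliary function, we define

$$e(\Phi, \mathbf{R}, \omega, D) = \begin{cases} 1 & \text{if } \mathbf{R} \models \omega \text{ and } D \in \Phi, \\ 0 & \text{otherwise,} \end{cases}$$

and require, for all $\omega \in \Omega_{C,\mathcal{T}}$ and $D \in clos(C,\mathcal{T})$,

$$\text{if } \max(\Phi_2, \omega, D) = 0 \text{ then } e(\Phi_0, \mathrm{Inv}(\mathbf{R}_2), \omega, D) = 0,$$
$$\text{if } h_0(\omega, D) = 0 \text{ then } e(\Phi_2, \mathbf{R}_2, \omega, D) = 0. \qquad (*)$$

Finally, $\Phi_1 = \Phi_0$, $\mathbf{R}_1 = \emptyset$ and

$$\ell_1 = \lambda\omega D.\ \ell_0(\omega, D) \mathbin{\dot{-}} e(\Phi_2, \mathbf{R}_2, \omega, D),$$
$$h_1 = \lambda\omega D.\ h_0(\omega, D) - e(\Phi_2, \mathbf{R}_2, \omega, D),$$
$$\ell_2 = \lambda\omega D.\ \min(\Phi_2, \omega, D) \mathbin{\dot{-}} e(\Phi_0, \mathrm{Inv}(\mathbf{R}_2), \omega, D), \text{ and}$$
$$h_2 = \lambda\omega D.\ \max(\Phi_2, \omega, D) - e(\Phi_0, \mathrm{Inv}(\mathbf{R}_2), \omega, D)$$

must hold, where $\dot{-}$ denotes subtraction in $\mathbb{N}$, i.e., $x \mathbin{\dot{-}} y = \max(0, x - y)$. ◇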
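The counter update for the left successor state in (A3) can be made concrete with a short Python sketch. It assumes that $e(\Phi, \mathbf{R}, \omega, D)$ is 1 iff $\mathbf{R} \models \omega$ and $D \in \Phi$ (as its use in the later proofs suggests), and it simplifies role expressions to frozensets of role names read as conjunctions, so that entailment becomes a subset test; all function names are hypothetical.

```python
import math


def monus(x, y):
    """Truncated subtraction on the naturals (extended by infinity): max(0, x - y)."""
    return max(0, x - y)


def e(phi, roles, omega, d):
    """1 if the role set entails omega and d is in phi, else 0.
    Assumption: omega is a frozenset of role names read as a conjunction."""
    return 1 if omega <= roles and d in phi else 0


def left_limits(l0, h0, phi2, roles2):
    """Limiting functions of the left successor, given those of the current state
    (as dicts keyed by (omega, D)) and the right successor's label (phi2, roles2)."""
    l1 = {(w, d): monus(v, e(phi2, roles2, w, d)) for (w, d), v in l0.items()}
    # plain subtraction suffices for h: requirement (*) rules out h0 = 0 with e = 1
    h1 = {(w, d): v - e(phi2, roles2, w, d) for (w, d), v in h0.items()}
    return l1, h1
```

Each right successor that counts as a witness for a pair $(\omega, D)$ thus lowers both the number of witnesses still required and the number still allowed by one.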

The choice of $\overline{\mathsf{NR}}_{C,\mathcal{T}}$ as the role component of the initial states in $I$ is arbitrary and
indeed every non-empty set of roles could be used instead of $\overline{\mathsf{NR}}_{C,\mathcal{T}}$. Note that the subtraction in
the requirements for $h_1$ and $h_2$ never yields a negative value because of $(*)$. Moreover, $\mathfrak{A}_{C,\mathcal{T}}$
is small enough (i.e., exponential in the input) to be of use in our further considerations:

Lemma 4.35 

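Let $C$ be an ALCQIb-concept in NNF, $\mathcal{T}$ an ALCQIb-TBox, $m = |C| + |\mathcal{T}|$, and
$\mathfrak{A}_{C,\mathcal{T}} = (Q, \Sigma, I, \delta)$ the looping tree automaton for $C$ and $\mathcal{T}$. Then

$$\sharp Q + \sharp\delta = O(2^{m^5}).$$

Proof. The cardinality of $lc(C,\mathcal{T})$ is bounded by $2^{\sharp clos(C,\mathcal{T})} = 2^{O(m)}$. The cardinality
of $\overline{\mathsf{NR}}_{C,\mathcal{T}}$ is bounded by $2m$ and hence $\sharp 2^{\overline{\mathsf{NR}}_{C,\mathcal{T}}} = 2^{O(m)}$. Finally, the cardinality
of $limit(C,\mathcal{T}) = \{f \mid f : \Omega_{C,\mathcal{T}} \times succ(C,\mathcal{T}) \rightarrow \{0, \dots, n_{max}(C,\mathcal{T}), \infty\}\}$ is bounded by
$(n_{max}(C,\mathcal{T}) + 2)^{\sharp(\Omega_{C,\mathcal{T}} \times succ(C,\mathcal{T}))} = O((2^m)^{m^2}) = O(2^{m^3})$, where $2^m$ is an upper bound for
$n_{max}(C,\mathcal{T})$ if numbers are coded in binary in the input. Summing up, we get
$2^{O(m)} \times 2^{O(m)} \times O(2^{m^3}) = O(2^{m^4})$ as a bound for $\sharp Q$ and $O(2^{m^5})$ as a bound for $\sharp\delta$, which dominates $\sharp Q$. ■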

We now show that emptiness of $L(\mathfrak{A}_{C,\mathcal{T}})$ is indeed equivalent to unsatisfiability of $C$
w.r.t. $\mathcal{T}$.

Lemma 4.36

For an ALCQIb-concept $C$ in NNF and an ALCQIb-TBox $\mathcal{T}$, $L(\mathfrak{A}_{C,\mathcal{T}}) \neq \emptyset$ iff $C$ is satisfiable
w.r.t. $\mathcal{T}$.

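Proof. Assume $L(\mathfrak{A}_{C,\mathcal{T}}) \neq \emptyset$, $T$ is a tree accepted by $\mathfrak{A}_{C,\mathcal{T}}$, and $r$ is an arbitrary run of
$\mathfrak{A}_{C,\mathcal{T}}$ on $T$ with $r(\epsilon) \in I$. From $T$, we will construct a model $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ for $C$ and $\mathcal{T}$,
which proves satisfiability of $C$ w.r.t. $\mathcal{T}$. For every path $p \in \{1,2\}^*$ with $r(p) = (\Phi, \mathbf{R}, \ell, h)$,
we define $\Phi_p := \Phi$, $\mathbf{R}_p := \mathbf{R}$, $\ell_p := \ell$, and $h_p := h$.
The domain $\Delta^{\mathcal{I}}$ of $\mathcal{I}$ is defined by $\Delta^{\mathcal{I}} = \{p \in \{1,2\}^*2 \mid T(p) \neq (*,*)\} \cup \{\epsilon\}$. Hence,
$\Delta^{\mathcal{I}}$ contains only "right successors" and the root. For concept names $A$, we define

$$A^{\mathcal{I}} = \{p \in \Delta^{\mathcal{I}} \mid A \in \Phi_p\}.$$

For the interpretation of roles, we define

$$R^{\mathcal{I}} = \{(p, p') \in \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}} \mid p' \in p1^*2,\ R \in \mathbf{R}_{p'}\} \cup \{(p', p) \in \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}} \mid p' \in p1^*2,\ R \in \mathrm{Inv}(\mathbf{R}_{p'})\}.$$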

Before we prove that $\mathcal{I}$ is indeed a model for $C$ and $\mathcal{T}$, we state some general properties
of the automaton and this construction.

(R1) Due to the construction of $\Delta^{\mathcal{I}}$, for every $p \in \Delta^{\mathcal{I}}$, $\mathbf{R}_p \neq \emptyset$ and hence $r(p) \neq (*,*,*,*)$.

(R2) "Once $(*,*,*,*)$, always $(*,*,*,*)$." For a path $p \in \{1,2\}^*$, if $r(p) = (*,*,*,*)$,
then, for all $p'$ with $p' \in p\{1,2\}^*$, $T(p') = (*,*)$ and $r(p') = (*,*,*,*)$.

(R3) "A left successor is either $(*,*,*,*)$ or an auxiliary state, in which case it is labelled
with the same set from $lc(C,\mathcal{T})$." For a path $p \in \{1,2\}^*$, if $r(p) = (\Phi, \mathbf{R}, \ell, h)$, then,
for all $p' \in p1^+$, if $r(p') \neq (*,*,*,*)$ then $r(p')$ is of the form $r(p') = (\Phi, \emptyset, \ell', h')$.

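(R4) "$\ell$ and $h$ are lower and upper bounds on the number of successors of a node." For
a path $p \in \{1,2\}^*$ with $T(p) \neq (*,*)$, $\omega \in \Omega_{C,\mathcal{T}}$, and $D \in succ(C,\mathcal{T})$,

$$\ell_p(\omega, D) \leq \sharp\{p' \in p1^*2 \mid \mathbf{R}_{p'} \models \omega,\ D \in \Phi_{p'}\} \leq h_p(\omega, D).$$

This property is less obvious than the others and we give a proof by induction on

$$\|p\| = \sum_{\omega \in \Omega_{C,\mathcal{T}},\ D \in succ(C,\mathcal{T})} \ell_p(\omega, D).$$

If $\|p\| = 0$, then $\ell_p(\omega, D) = 0$ for all $\omega \in \Omega_{C,\mathcal{T}}$ and $D \in succ(C,\mathcal{T})$ and hence, by
(A2) and (R2), for all descendants $p' \in p1^*2$ of $p$, $r(p') = (*,*,*,*)$ and $T(p') = (*,*)$.
Thus

$$0 = \ell_p(\omega, D) = \sharp\{p' \in p1^*2 \mid \mathbf{R}_{p'} \models \omega,\ D \in \Phi_{p'}\} \leq h_p(\omega, D)$$

holds for all $\omega \in \Omega_{C,\mathcal{T}}$ and $D \in succ(C,\mathcal{T})$.

If $\|p\| > 0$ then there is an $\omega \in \Omega_{C,\mathcal{T}}$ and a $D \in succ(C,\mathcal{T})$ with $\ell_p(\omega, D) > 0$,
$\mathbf{R}_{p2} \models \omega$, and $D \in \Phi_{p2}$. Hence, $\ell_{p1}(\omega, D) = \ell_p(\omega, D) - 1$ and $\|p1\| < \|p\|$ by (A3)
and we can use the induction hypothesis for $p1$.
For all $\omega \in \Omega_{C,\mathcal{T}}$ and $D \in succ(C,\mathcal{T})$,

$$\begin{aligned}
\ell_p(\omega, D) &\leq \ell_{p1}(\omega, D) + e(\Phi_{p2}, \mathbf{R}_{p2}, \omega, D) \\
&\leq_{(*)} \sharp\{p' \in p11^*2 \mid \mathbf{R}_{p'} \models \omega,\ D \in \Phi_{p'}\} + \sharp\{p' \in \{p2\} \mid \mathbf{R}_{p'} \models \omega,\ D \in \Phi_{p'}\} \\
&= \sharp\{p' \in p1^*2 \mid \mathbf{R}_{p'} \models \omega,\ D \in \Phi_{p'}\} \\
&= \sharp\{p' \in p11^*2 \mid \mathbf{R}_{p'} \models \omega,\ D \in \Phi_{p'}\} + \sharp\{p' \in \{p2\} \mid \mathbf{R}_{p'} \models \omega,\ D \in \Phi_{p'}\} \\
&\leq_{(*)} h_{p1}(\omega, D) + e(\Phi_{p2}, \mathbf{R}_{p2}, \omega, D) \\
&= h_p(\omega, D),
\end{aligned}$$

where the steps marked with $(*)$ use the induction hypothesis. This is what we
needed to show.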

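(R5) For two paths $p, q \in \Delta^{\mathcal{I}}$ and a role expression $\omega \in \Omega_{C,\mathcal{T}}$, if $(p, q) \in \omega^{\mathcal{I}}$ then $q \in p1^*2$
or $p \in q1^*2$.

Because of the syntactic restriction to safe role expressions in ALCQIb, for $(p, q) \in \omega^{\mathcal{I}}$
to hold there must be a role $R \in \overline{\mathsf{NR}}_{C,\mathcal{T}}$ such that $(p, q) \in R^{\mathcal{I}}$. By construction of
$R^{\mathcal{I}}$, this can only be the case if $q \in p1^*2$ or $p \in q1^*2$.

(R6) For two paths $p, q \in \Delta^{\mathcal{I}}$ with $q \in p1^*2$ and a role expression $\omega$, $(p, q) \in \omega^{\mathcal{I}}$ iff $\mathbf{R}_q \models \omega$ and
$(q, p) \in \omega^{\mathcal{I}}$ iff $\mathrm{Inv}(\mathbf{R}_q) \models \omega$.

For every $R \in \overline{\mathsf{NR}}_{C,\mathcal{T}}$, $(p, q) \in R^{\mathcal{I}}$ iff $R \in \mathbf{R}_q$ holds as follows. For a (non-inverse)
role $R \in \overline{\mathsf{NR}}_{C,\mathcal{T}} \cap \mathsf{NR}$, immediately by the construction of $R^{\mathcal{I}}$, $(p, q) \in R^{\mathcal{I}}$ iff $R \in \mathbf{R}_q$.
For an inverse role $R = S^{-1}$ with $S \in \overline{\mathsf{NR}}_{C,\mathcal{T}} \cap \mathsf{NR}$, $(p, q) \in R^{\mathcal{I}}$ iff $(q, p) \in S^{\mathcal{I}}$ iff
$\mathrm{Inv}(S) = R \in \mathbf{R}_q$. Hence, $(p, q) \in \omega^{\mathcal{I}}$ iff $\mathbf{R}_q \models \omega$. Similarly, for every $R \in \overline{\mathsf{NR}}_{C,\mathcal{T}}$,
$(q, p) \in R^{\mathcal{I}}$ iff $\mathrm{Inv}(R) \in \mathbf{R}_q$, and hence $(q, p) \in \omega^{\mathcal{I}}$ iff $\mathrm{Inv}(\mathbf{R}_q) \models \omega$.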

Using these properties we can now show: 

Claim 4.37 For all $p \in \Delta^{\mathcal{I}}$ and $D \in \Phi_p$, $p \in D^{\mathcal{I}}$.

The proof is by induction on the norm $\|\cdot\|$ of the concepts (as defined in Definition 4.12). The base cases are $D = A$ or $D = \neg A$ for a concept name $A \in \mathsf{NC}$. For $D = A$ this is immediate by the definition of $A^{\mathcal{I}}$. For the case $D = \neg A$, since $\Phi_p \in lc(C,\mathcal{T})$, $\neg A \in \Phi_p$ implies $A \notin \Phi_p$ and hence $p \in (\neg A)^{\mathcal{I}}$. For the induction step, we distinguish the different concept operators of ALCQIb.

• If $D = C_1 \sqcap C_2 \in \Phi_p$ then, since $\Phi_p \in lc(C,\mathcal{T})$, also $\{C_1, C_2\} \subseteq \Phi_p$. Hence, by induction, $p \in C_1^{\mathcal{I}}$, $p \in C_2^{\mathcal{I}}$ and thus $p \in D^{\mathcal{I}}$.

• The case $D = C_1 \sqcup C_2$ is similar to the previous one.

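• Now assume $D = (\bowtie n\ \omega\ E)$. For every $q \in \Delta^{\mathcal{I}}$, $\Phi_q \in lc(C,\mathcal{T})$ and hence $E \in \Phi_q$ iff ${\sim}E \notin \Phi_q$. Since $\|E\| = \|{\sim}E\| < \|D\|$, by induction, $q \in E^{\mathcal{I}}$ iff $E \in \Phi_q$ holds for every $q \in \Delta^{\mathcal{I}}$.

If $p = \epsilon$ is the root of $T$ then, by (R5) and (R6),

\[ \sharp\{q \mid (p,q) \in \omega^{\mathcal{I}},\ q \in E^{\mathcal{I}}\} = \sharp\{q \in p1^*2 \mid \mathbf{R}_q \models \omega,\ E \in \Phi_q\} \]

and hence, by (R4),

\[ \min(\Phi_p, \omega, E) \le \sharp\{q \mid (p,q) \in \omega^{\mathcal{I}},\ q \in E^{\mathcal{I}}\} \le \max(\Phi_p, \omega, E). \]

If $p \neq \epsilon$, then $p \in \{1,2\}^*2$ is a "right successor". Let $q_0$ be the unique path in $\{1,2\}^*2 \cup \{\epsilon\}$ with $p = q_0 1^k 2$, i.e., $p$'s "predecessor" in $\mathcal{I}$. Then

\begin{align*}
\sharp\{q \mid (p,q) \in \omega^{\mathcal{I}},\ q \in E^{\mathcal{I}}\}
&= \sharp\{q \in p1^*2 \mid \mathbf{R}_q \models \omega,\ E \in \Phi_q\} + e(\Phi_{q_0}, \mathsf{Inv}(\mathbf{R}_p), \omega, E) \\
&= \sharp\{q \in p1^*2 \mid \mathbf{R}_q \models \omega,\ E \in \Phi_q\} + e(\Phi_{q_0 1^k}, \mathsf{Inv}(\mathbf{R}_p), \omega, E).
\end{align*}

If $E \notin \Phi_{q_0}$ or $\mathsf{Inv}(\mathbf{R}_p) \not\models \omega$, then $e(\Phi_{q_0 1^k}, \mathsf{Inv}(\mathbf{R}_p), \omega, E) = 0$ and

\begin{align*}
\min(\Phi_p, \omega, E) &= \ell_p(\omega, E) \\
&\le \sharp\{q \mid q \in p1^*2,\ \mathbf{R}_q \models \omega,\ E \in \Phi_q\} \\
&= \sharp\{q \mid (p,q) \in \omega^{\mathcal{I}},\ q \in E^{\mathcal{I}}\} \\
&\le h_p(\omega, E) = \max(\Phi_p, \omega, E)
\end{align*}

holds because of induction, (R4), (R5), and (R6). If $E \in \Phi_{q_0}$ and $\mathsf{Inv}(\mathbf{R}_p) \models \omega$, then $e(\Phi_{q_0 1^k}, \mathsf{Inv}(\mathbf{R}_p), \omega, E) = 1$ and

\begin{align*}
\min(\Phi_p, \omega, E) &\le \ell_p(\omega, E) + 1 \\
&\le \sharp\{q \in p1^*2 \mid \mathbf{R}_q \models \omega,\ E \in \Phi_q\} + 1 \\
&= \sharp\{q \mid (p,q) \in \omega^{\mathcal{I}},\ q \in E^{\mathcal{I}}\} \\
&\le h_p(\omega, E) + 1 = \max(\Phi_p, \omega, E)
\end{align*}

again holds by (R4), (R5), and (R6).

If $D = (\geq n\ \omega\ E)$ then $n \le \min(\Phi_p, \omega, E) \le \sharp\{q \mid (p,q) \in \omega^{\mathcal{I}},\ q \in E^{\mathcal{I}}\}$ and hence $p \in D^{\mathcal{I}}$. If $D = (\leq n\ \omega\ E)$ then $n \ge \max(\Phi_p, \omega, E) \ge \sharp\{q \mid (p,q) \in \omega^{\mathcal{I}},\ q \in E^{\mathcal{I}}\}$ and hence $p \in D^{\mathcal{I}}$.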

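This finishes the proof of the claim, which yields the only-if direction of the lemma: if $L(\mathfrak{A}_{C,\mathcal{T}}) \neq \emptyset$ then there exists a tree $T \in L(\mathfrak{A}_{C,\mathcal{T}})$ and a corresponding interpretation $\mathcal{I}$ that satisfies the claim. Since $C \in \Phi_\epsilon$, $\epsilon \in C^{\mathcal{I}}$ and hence $C^{\mathcal{I}} \neq \emptyset$. Also, for every $p \in \Delta^{\mathcal{I}}$ and every $C_1 \sqsubseteq C_2 \in \mathcal{T}$, $\mathsf{NNF}(\neg C_1 \sqcup C_2) \in \Phi_p$. Hence $(\neg C_1 \sqcup C_2)^{\mathcal{I}} = \Delta^{\mathcal{I}}$ and $\mathcal{I} \models \mathcal{T}$.

For the if-direction, let $C$ be satisfiable w.r.t. $\mathcal{T}$ and $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ a model of $\mathcal{T}$ with $C^{\mathcal{I}} \neq \emptyset$. We construct a tree $T$ from $\mathcal{I}$ that is accepted by $\mathfrak{A}_{C,\mathcal{T}}$. To this purpose, we define a function $\pi : \{1,2\}^* \to \Delta^{\mathcal{I}} \cup \{*\}$ and maintain an agenda of paths $p \in \{1,2\}^*$ whose successors still need consideration.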

Let $s \in \Delta^{\mathcal{I}}$ be an arbitrary element such that $s \in C^{\mathcal{I}}$. Set $\pi(\epsilon) = s$ and $T(\epsilon) = (\Phi_s, \overline{\mathsf{NR}}_{C,\mathcal{T}})$ with $\Phi_s = \{D \in \mathsf{clos}(C,\mathcal{T}) \mid s \in D^{\mathcal{I}}\}$. Initialize the agenda with $\epsilon$.

Pick the first element $p \in \{1,2\}^*$ off the agenda. For $s = \pi(p)$, let $\Phi_s = \{D \in \mathsf{clos}(C,\mathcal{T}) \mid s \in D^{\mathcal{I}}\}$ and let $X \subseteq \Delta^{\mathcal{I}}$ be a set such that

• $X \subseteq \{t \in \Delta^{\mathcal{I}} \mid (s,t) \in R^{\mathcal{I}} \text{ for some } R \in \overline{\mathsf{NR}}_{C,\mathcal{T}}\}$.

• For every $(\geq n\ \omega\ D) \in \Phi_s$ there are $t_1, \dots, t_n \in X$ with $(s, t_i) \in \omega^{\mathcal{I}}$, $t_i \in D^{\mathcal{I}}$ for $1 \le i \le n$ and $t_i \neq t_j$ for $1 \le i < j \le n$.

• $X$ is minimal w.r.t. set cardinality with these properties.

Such a set $X$ exists, is finite, possibly empty, and not necessarily uniquely defined. Let $\{t_1, \dots, t_n\}$ be an enumeration of $X$.

• For every $1 < i \le n$, we set $\pi(p1^{i-1}) = *$ and $T(p1^{i-1}) = (*,*)$.

• For every $1 \le i \le n$, we set $\pi(p1^{i-1}2) = t_i$ and

\[ T(p1^{i-1}2) = (\Phi_{t_i}, \mathbf{R}_{t_i}) \]

where

\begin{align*}
\Phi_{t_i} &= \{D \in \mathsf{clos}(C,\mathcal{T}) \mid t_i \in D^{\mathcal{I}}\} \\
\mathbf{R}_{t_i} &= \{R \in \overline{\mathsf{NR}}_{C,\mathcal{T}} \mid (s, t_i) \in R^{\mathcal{I}}\}.
\end{align*}

Put $p1^{i-1}2$ at the end of the agenda.

• Finally, for all $p' \in p1^n\{1,2\}^*$ we define $\pi(p') = *$ and $T(p') = (*,*)$.

Figure 4.9 illustrates this construction.

Continuing this process until the agenda runs empty (or indefinitely if it never does) eventually defines $T(p)$ for every $p \in \{1,2\}^*$ (since the agenda is organised as a queue, every element will eventually be taken off the agenda). The proof that $T \in L(\mathfrak{A}_{C,\mathcal{T}})$ (and hence $L(\mathfrak{A}_{C,\mathcal{T}}) \neq \emptyset$) is relatively simple and omitted here. ■
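
The agenda-driven construction of $\pi$ can be sketched in Python. This is only an illustration under simplifying assumptions: paths are strings over "1" and "2", the choice of the minimal witness set $X$ is delegated to a caller-supplied function `witnesses` (a hypothetical name), only the labelling $\pi$ is built (not the full labels $T(p)$), and the number of processed agenda entries is capped so that the loop terminates even when the real construction would run forever. Paths that are absent from the result are understood to carry the dummy label "*".

```python
from collections import deque


def unravel(root, witnesses, max_steps):
    """Build a finite prefix of the labelling pi by processing a FIFO agenda.

    root       -- the element s chosen for the empty path
    witnesses  -- hypothetical helper: maps an element to the list [t_1, ..., t_n]
    max_steps  -- cap on processed agenda entries (an assumption of this sketch)
    """
    pi = {"": root}
    agenda = deque([""])
    for _ in range(max_steps):
        if not agenda:
            break
        p = agenda.popleft()  # queue discipline: oldest path first
        for i, t in enumerate(witnesses(pi[p]), start=1):
            if i > 1:
                # auxiliary node on the left spine p1^(i-1)
                pi[p + "1" * (i - 1)] = "*"
            child = p + "1" * (i - 1) + "2"
            pi[child] = t
            agenda.append(child)
    return pi


# Toy interpretation: a needs witnesses b and c, b needs a, c needs none.
successors = {"a": ["b", "c"], "b": ["a"], "c": []}
labelling = unravel("a", lambda s: successors[s], max_steps=2)
```

Because the agenda is a queue, every path that is ever added is processed after finitely many steps, which is the property the text relies on.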



Theorem 4.38 

Satisfiability of ALCQIb-concepts w.r.t. general TBoxes is ExpTime-complete, even if numbers in the input are represented in binary coding.

Proof. ExpTime-hardness was established in Lemma 4.31. By Lemma 4.36, generating $\mathfrak{A}_{C,\mathcal{T}}$ and testing $L(\mathfrak{A}_{C,\mathcal{T}})$ for emptiness decides satisfiability of $C$ w.r.t. $\mathcal{T}$. Due to Lemma 4.35 and Fact 4.33 this can be done in time exponential in $|C| + |\mathcal{T}|$. ■

Now that we know how to deal with satisfiability of ALCQIb-concepts w.r.t. TBoxes, we show how satisfiability of full knowledge bases can be reduced to that problem using a pre-completion technique similar to the one in (Hollunder, 1996) for ALCQ-knowledge bases (see also Section 3.2.3).

The definition of $\mathsf{clos}(\cdot)$ is extended to ALCQIb-knowledge bases as follows. For an ALCQIb-knowledge base $\mathcal{K} = (\mathcal{T}, \mathcal{A})$, we define $\mathsf{clos}(\mathcal{K})$ as the smallest set $X$ that satisfies the following properties:

• for every $x : D \in \mathcal{A}$, $\mathsf{NNF}(D) \in X$

• for every $C_1 \sqsubseteq C_2 \in \mathcal{T}$, $\mathsf{NNF}(\neg C_1 \sqcup C_2) \in X$

• $X$ is closed under sub-concepts and the application of ${\sim}$.

Again, $\sharp\mathsf{clos}(\mathcal{K}) = O(|\mathcal{K}|)$ holds (compare Lemma 4.9).
Definition 4.39 

Let $\mathcal{K} = (\mathcal{T}, \mathcal{A})$ be an ALCQIb-knowledge base. A knowledge base $\mathcal{K}' = (\mathcal{T}, \mathcal{A}')$ is a pre-completion of $\mathcal{K}$, if

1. there is a surjective function

\[ f : \{x \in \mathsf{NI} \mid x \text{ occurs in } \mathcal{A}\} \to \{x \in \mathsf{NI} \mid x \text{ occurs in } \mathcal{A}'\} \]

such that

• if $x : C \in \mathcal{A}$ then $f(x) : C \in \mathcal{A}'$

• if $(x,y) : R \in \mathcal{A}$ then $(f(x), f(y)) : R \in \mathcal{A}'$

2. for every $x$ that occurs in $\mathcal{A}'$ and every $D \in \mathsf{clos}(\mathcal{K})$, $x : D \in \mathcal{A}'$ or $x : {\sim}D \in \mathcal{A}'$

3. for every $x$ that occurs in $\mathcal{A}'$, if $x : C_1 \sqcap C_2 \in \mathcal{A}'$ then $x : C_1 \in \mathcal{A}'$ and $x : C_2 \in \mathcal{A}'$

4. for every $x$ that occurs in $\mathcal{A}'$, if $x : C_1 \sqcup C_2 \in \mathcal{A}'$ then $x : C_1 \in \mathcal{A}'$ or $x : C_2 \in \mathcal{A}'$

5. for every two distinct $x, y$ that occur in $\mathcal{A}'$, $x \neq y \in \mathcal{A}'$

A knowledge base $\mathcal{K}'$ that satisfies 2-5 is called pre-completed. ◇

It is easy to see that a knowledge base is satisfiable iff it has a pre-completion that has a model that exactly satisfies the role assertions:

Lemma 4.40 

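Let $\mathcal{K} = (\mathcal{T}, \mathcal{A})$ be an ALCQIb-knowledge base and $\overline{\mathsf{NR}}_{\mathcal{K}}$ the set of roles that occur in $\mathcal{K}$ together with their inverses. $\mathcal{K}$ is satisfiable iff there exists a pre-completion $\mathcal{K}' = (\mathcal{T}, \mathcal{A}')$ of $\mathcal{K}$ and a model $\mathcal{I}$ of $\mathcal{K}'$ such that, for every $x, y \in \mathsf{NI}$ that occur in $\mathcal{A}'$,

\[ \{R \in \overline{\mathsf{NR}}_{\mathcal{K}} \mid (x,y) : R \in \mathcal{A}'\} = \{R \in \overline{\mathsf{NR}}_{\mathcal{K}} \mid (x^{\mathcal{I}}, y^{\mathcal{I}}) \in R^{\mathcal{I}}\}. \]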

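For a pre-completion $\mathcal{K}' = (\mathcal{T}, \mathcal{A}')$, the existence of such a model can be reduced to concept satisfiability w.r.t. $\mathcal{T}$. For an individual $x$ that occurs in $\mathcal{A}'$, we define $C_x$ by

\begin{align*}
C_x = {} & \bigsqcap \{A \mid A \in \mathsf{NC},\ x : A \in \mathcal{A}'\} \sqcap {} \\
& \bigsqcap \{\neg A \mid A \in \mathsf{NC},\ x : \neg A \in \mathcal{A}'\} \sqcap {} \\
& \bigsqcap \{(\geq (n - m)\ \omega\ D) \mid x : (\geq n\ \omega\ D) \in \mathcal{A}',\ m = \sharp\omega^{\mathcal{A}'}(x, D)\} \sqcap {} \\
& \bigsqcap \{(\leq (n - m)\ \omega\ D) \mid x : (\leq n\ \omega\ D) \in \mathcal{A}',\ m = \sharp\omega^{\mathcal{A}'}(x, D)\}.
\end{align*}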

Lemma 4.41 

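Let $\mathcal{K}' = (\mathcal{T}, \mathcal{A}')$ be a pre-completed ALCQIb-knowledge base. $\mathcal{K}'$ has a model that satisfies

\[ \{R \in \overline{\mathsf{NR}}_{\mathcal{K}'} \mid (x,y) : R \in \mathcal{A}'\} = \{R \in \overline{\mathsf{NR}}_{\mathcal{K}'} \mid (x^{\mathcal{I}}, y^{\mathcal{I}}) \in R^{\mathcal{I}}\} \]

for every $x, y \in \mathsf{NI}$ that occur in $\mathcal{A}'$ iff, for every $x$ that occurs in $\mathcal{A}'$, the concept $C_x$ is satisfiable w.r.t. $\mathcal{T}$.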

The proof of this lemma is straightforward and omitted here. 

Putting together Lemma 4.40 and Lemma 4.41, we have the steps of a reduction from 
knowledge-base satisfiability to concept satisfiability w.r.t. general TBoxes — a problem that 
we know how to solve in ExpTime (Theorem 4.38). But how do we obtain an ExpTime- 
algorithm from these lemmas? Lemma 4.40 involves a non-deterministic step since it talks about the existence of a pre-completion. Since it is generally assumed that ExpTime $\neq$ NExpTime, we have to show how to search for such a pre-completion in exponential time.

Theorem 4.42 

Knowledge base satisfiability and instance checking for ALCQIb are ExpTime-complete, even if numbers in the input are represented using binary coding.

Proof. ExpTime-hardness is immediate from Theorem 4.38. It remains to show that these problems can be decided in exponential time.

Let $\mathcal{K} = (\mathcal{T}, \mathcal{A})$ be an ALCQIb-knowledge base, $\overline{\mathsf{NR}}_{\mathcal{K}}$ the set of roles that occur in $\mathcal{K}$, together with their inverses, and $\mathsf{clos}(\mathcal{K})$ defined as above. Let $m = |\mathcal{K}|$. Only ABoxes $\mathcal{A}'$ with no more individuals than $\mathcal{A}$ are candidates for pre-completions because the mapping $f$ must be surjective. The number of individuals in $\mathcal{A}$ is bounded by $m$.
For an ABox $\mathcal{A}'$ with $i \le m$ individuals, concept assertions ranging over $\mathsf{clos}(\mathcal{K})$, and role assertions ranging over $\overline{\mathsf{NR}}_{\mathcal{K}}$, there are at most $2^{i \times m} \times 2^{i^2 \times 2m} = O(2^{m^5})$ different possibilities, and each such ABox contains at most $i \times m + i^2 \times 2m + i^2 = O(m^3)$ assertions. For an ABox $\mathcal{A}'$ with $i$ individuals there are at most $i^m = O(2^{m^2})$ different possibilities of mapping the individuals from $\mathcal{A}$ (of which there are at most $m$ many) into the $i$ individuals of $\mathcal{A}'$. Given a fixed $\mathcal{A}'$ and a fixed mapping $f$, testing whether the requirements of Definition 4.39 are satisfied can be done in polynomial time in $m$ and hence certainly in time $O(2^m)$.

Summing up, it is possible to enumerate all potential pre-completions of $\mathcal{K}$, generate all possible mappings $f$, and test whether all requirements from Definition 4.39 are satisfied in time bounded by

\[ \sum_{i=1}^{m} \left( O(2^{m^5}) \times O(2^{m^2}) \times O(2^m) \right) = O(2^{m^6}). \]

Due to Lemma 4.40 and Lemma 4.41, $\mathcal{K}$ is satisfiable iff this enumeration yields a pre-completion $\mathcal{K}' = (\mathcal{T}, \mathcal{A}')$ such that $C_x$ is satisfiable w.r.t. $\mathcal{T}$ for every $x$ that occurs in $\mathcal{A}'$. Since all candidate ABoxes $\mathcal{A}'$ from the enumeration contain at most $O(m^3)$ assertions, this can be checked in time exponential in $m$ for every candidate pre-completion. This yields an overall decision procedure that runs in time exponentially bounded in $m$.

Instance checking is at least as hard as concept satisfiability w.r.t. general TBoxes and not harder than knowledge base satisfiability, hence ExpTime-completeness of instance checking for ALCQIb is immediate from what we have just proved. ■

Chapter 5
Cardinality Restrictions and Nominals
In this chapter, we study the complexity of the combination of the DLs ALCQ and ALCQI with a terminological formalism based on cardinality restrictions on concepts. Cardinality restrictions were first introduced by Baader, Buchheit, and Hollunder (1996) as a terminological formalism that is particularly useful for configuration applications. They make it possible to restrict the number of instances of a (possibly complex) concept $C$ globally using expressions of the form $(\geq n\ C)$ or $(\leq n\ C)$. In a configuration application, the cardinality restriction $(\leq 100\ \mathsf{Parts})$ can be used to limit the overall number of Parts to 100, the cardinality restrictions $(\geq 1\ \mathsf{PowerSource})$ and $(\leq 1\ \mathsf{PowerSource})$ together state that there must be exactly one PowerSource, etc.

As it turns out, cardinality restrictions are closely connected to nominals, i.e., atomic 
concepts referring to single individuals of the domain. Nominals are studied both in the 
context of DLs (Borgida & Patel-Schneider, 1994; De Giacomo & Lenzerini, 1996) and of 
modal logics (Gargov & Goranko, 1993; Blackburn & Seligman, 1995; Areces et al., 2000). 

After introducing cardinality restrictions and nominals, we show that, in the presence of nominals, reasoning w.r.t. cardinality restrictions can be polynomially reduced to reasoning w.r.t. TBoxes. In general the latter is a simpler problem. This allows us to determine the complexity of ALCQ with cardinality restrictions as ExpTime-complete as a corollary of a result in (De Giacomo, 1995), if unary coding of numbers in the input is assumed. For binary coding, we will show that the problem becomes NExpTime-hard. Of all logics studied in this thesis, ALCQ with cardinality restrictions is the only logic for which it has been shown that the coding of numbers affects the complexity of the inference problems.

For ALCQI with cardinality restrictions, we show that reasoning becomes NExpTime-hard and is NExpTime-complete if unary coding of numbers is assumed. By the connection to reasoning with nominals, this implies that reasoning w.r.t. general TBoxes for ALCQI with nominals has the same complexity and we sharpen this result to pure concept satisfiability.

Finally, we generalise the results for reasoning with ALCQI to ALCQIb, with little effort, and show that, for ALCQIB, i.e., ALCQIb without the restriction to safe role expressions, concept satisfiability is NExpTime-complete (this is also a simple corollary of the NExpTime-completeness of Boolean Modal Logic (Lutz & Sattler, 2000)).

5.1 Syntax and Semantics 

Cardinality restrictions can be defined independently of a particular DL as long as it has standard extensional semantics. In this thesis, we will mainly study cardinality restrictions in combination with the DLs ALCQ and ALCQI. To make our considerations here easier, we assume that concepts are built using only the restricted set of concept constructors $\neg$, $\sqcap$, $(\geq n\ R\ C)$. Using de Morgan's laws and the duality of the at-least and at-most-restriction (see below Definition 4.2) the other constructors can be defined as abbreviations.

Definition 5.1 (Cardinality Restrictions) 

A cardinality restriction is an expression of the form $(\leq n\ C)$ or $(\geq n\ C)$ where $n \in \mathbb{N}$ and $C$ is a concept.

A CBox is a finite set of cardinality restrictions.

An interpretation $\mathcal{I}$ satisfies a cardinality restriction $(\leq n\ C)$ iff $\sharp(C^{\mathcal{I}}) \le n$, and it satisfies $(\geq n\ C)$ iff $\sharp(C^{\mathcal{I}}) \ge n$. It satisfies a CBox $\mathcal{C}$ iff it satisfies all cardinality restrictions in $\mathcal{C}$; in this case, $\mathcal{I}$ is called a model of $\mathcal{C}$ and we will denote this fact by $\mathcal{I} \models \mathcal{C}$. A CBox that has a model is called satisfiable.

Since $\mathcal{I} \models (\leq 0\ \neg C)$ iff $C$ is satisfied by all elements of $\mathcal{I}$, we will use $(\forall\ C)$ as an abbreviation for the cardinality restriction $(\leq 0\ \neg C)$. ◇
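
The satisfaction condition for CBoxes over a finite interpretation can be illustrated with a short Python sketch. The encoding is an assumption made only for this example: a restriction is a triple `(op, n, concept)`, and the interpretation is given directly as a dictionary from concept names to their extensions, so complex concepts are not evaluated here. The function names are hypothetical.

```python
def satisfies(restriction, extension):
    """Check one cardinality restriction against the size of a concept extension."""
    op, n, concept = restriction
    size = len(extension[concept])
    if op == "<=":
        return size <= n
    if op == ">=":
        return size >= n
    raise ValueError(f"unknown comparison {op!r}")


def satisfies_cbox(cbox, extension):
    """An interpretation satisfies a CBox iff it satisfies every restriction in it."""
    return all(satisfies(r, extension) for r in cbox)


# The configuration example from the chapter introduction.
cbox = [("<=", 100, "Parts"), (">=", 1, "PowerSource"), ("<=", 1, "PowerSource")]
one_source = {"Parts": {"p1", "p2", "p3"}, "PowerSource": {"p1"}}
two_sources = {"Parts": {"p1", "p2", "p3"}, "PowerSource": {"p1", "p2"}}
```

With these inputs, `one_source` satisfies the CBox while `two_sources` violates the at-most restriction on PowerSource.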

It is obvious that, for DLs that are closed under Boolean combinations of concepts, reasoning with cardinality restrictions is at least as hard as reasoning with TBoxes, as $\mathcal{I} \models C \sqsubseteq D$ iff $\mathcal{I} \models (\leq 0\ (C \sqcap \neg D))$. As we will see, CBoxes can also be used to express ABoxes and even the stronger formalism of nominals. In this thesis, we have already encountered nominals in a restricted form, namely, as individuals that may occur in ABox assertions. DLs that allow for nominals allow those individuals to appear in arbitrary concept expressions, which, e.g., makes it possible to define the concept of parents of BOB by $\exists \mathsf{has\_child}.\mathsf{BOB}$ or the concept of BOB's siblings by $\neg\mathsf{BOB} \sqcap \exists \mathsf{has\_child}^{-1}.\exists \mathsf{has\_child}.\mathsf{BOB}$.

Definition 5.2 (Nominals) 

Let $\mathsf{NI}$ be a set of individual names or nominals. For an arbitrary DL $\mathcal{L}$, its extension with nominals (usually denoted by $\mathcal{LO}$) is obtained by, additionally, defining that every $i \in \mathsf{NI}$ is a concept.

For the semantics, we require an interpretation $\mathcal{I}$ to map every $i \in \mathsf{NI}$ to a singleton set $i^{\mathcal{I}}$ and extend the semantics of $\mathcal{L}$ to $\mathcal{LO}$ canonically. ◇

Nominals in a DL make ABoxes superfluous, since these can be captured using nominals. Indeed, in the presence of nominals, it suffices to consider satisfiability of TBoxes as the "strongest" inference required.

Lemma 5.3

For an arbitrary DL $\mathcal{L}$, KB-satisfiability can be polynomially reduced to satisfiability of $\mathcal{LO}$-TBoxes.

Proof. Let $\mathcal{K} = (\mathcal{T}, \mathcal{A})$ be an $\mathcal{L}$-knowledge base, where the individuals in the ABox coincide with the individuals of $\mathcal{LO}$. The ABox $\mathcal{A}$ is transformed into a TBox as follows. We define

\[ \mathcal{T}_{\mathcal{A}} = \{i \sqsubseteq C \mid i : C \in \mathcal{A}\} \cup \{i \sqsubseteq \exists R.j \mid (i,j) : R \in \mathcal{A}\} \cup \{i \sqsubseteq \neg j \mid i \neq j \in \mathcal{A}\}. \]

Claim 5.4 $\mathcal{K}$ is satisfiable iff $\mathcal{T} \cup \mathcal{T}_{\mathcal{A}}$ is satisfiable.

If $\mathcal{K}$ is satisfiable with $\mathcal{I} \models \mathcal{K}$, it is easy to verify that $\mathcal{I}'$, which is obtained from $\mathcal{I}$ by setting $i^{\mathcal{I}'} = \{i^{\mathcal{I}}\}$ and preserving the interpretation of the concept and role names, is a model for $\mathcal{T} \cup \mathcal{T}_{\mathcal{A}}$.

Conversely, any model $\mathcal{I}'$ of $\mathcal{T} \cup \mathcal{T}_{\mathcal{A}}$ can be turned into a model $\mathcal{I}$ of $\mathcal{K}$ by setting, for every individual $i \in \mathsf{NI}$, $i^{\mathcal{I}} = x$ for the unique $x \in i^{\mathcal{I}'}$ and preserving the interpretation of concept and role names. ■
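
The translation of an ABox into TBox axioms over nominals is purely syntactic, as the following Python sketch suggests. The tuple encoding of assertions and axioms is an assumption of this example, not notation from the thesis: a pair `(lhs, rhs)` stands for the general axiom lhs ⊑ rhs, `("some", R, j)` for an existential restriction on the nominal `j`, and `("not", j)` for the negated nominal.

```python
def abox_to_tbox(abox):
    """Turn ABox assertions into axioms whose left-hand sides are nominals.

    Supported assertions (hypothetical encoding):
      ("concept", i, C)   for  i : C
      ("role", i, R, j)   for  (i, j) : R
      ("neq", i, j)       for  i != j
    """
    axioms = []
    for assertion in abox:
        kind = assertion[0]
        if kind == "concept":
            _, i, concept = assertion
            axioms.append((i, concept))
        elif kind == "role":
            _, i, role, j = assertion
            axioms.append((i, ("some", role, j)))
        elif kind == "neq":
            _, i, j = assertion
            axioms.append((i, ("not", j)))
        else:
            raise ValueError(f"unknown assertion kind {kind!r}")
    return axioms


abox = [
    ("concept", "BOB", "Person"),
    ("role", "BOB", "has_child", "MARY"),
    ("neq", "BOB", "MARY"),
]
tbox_axioms = abox_to_tbox(abox)
```

Each assertion yields exactly one axiom, so the output is linear in the size of the ABox, in line with the polynomial bound claimed by the lemma.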

Now that we have seen how to get rid of ABoxes in the presence of nominals, we show 
how cardinality restrictions and nominals can emulate each other. 

Lemma 5.5 

For an arbitrary DL $\mathcal{L}$, satisfiability of $\mathcal{L}$-CBoxes and $\mathcal{LO}$-TBoxes are mutually reducible. The reduction from $\mathcal{LO}$ to $\mathcal{L}$ is polynomial. The reduction from $\mathcal{L}$ to $\mathcal{LO}$ is polynomial if unary coding of numbers in the cardinality restrictions is assumed.

Proof. It is obvious that the cardinality restrictions $(\leq 1\ C)$ and $(\geq 1\ C)$ enforce the interpretation of a concept name $C$ to be a singleton, which can now serve as a substitute for a nominal. Also, an interpretation satisfies a general axiom $C \sqsubseteq D$ iff it satisfies $(\leq 0\ (C \sqcap \neg D))$. In this manner, every nominal can be replaced by a concept and every general axiom by a cardinality restriction, which yields the reduction from reasoning with nominals and TBoxes to reasoning with cardinality restrictions. For the converse direction, the reduction works as follows.

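Let $\mathcal{C} = \{(\bowtie_1 n_1\ C_1), \dots, (\bowtie_k n_k\ C_k)\}$ be an $\mathcal{L}$-CBox. W.l.o.g., we assume that $\mathcal{C}$ contains no cardinality restriction of the form $(\geq 0\ C)$ because these are trivially satisfied by any interpretation. The translation of $\mathcal{C}$, denoted by $\Phi(\mathcal{C})$, is the $\mathcal{LO}$-TBox defined by:

\[ \Phi(\mathcal{C}) = \bigcup \{\Phi(\bowtie_i n_i\ C_i) \mid 1 \le i \le k\}, \]

where $\Phi(\bowtie_i n_i\ C_i)$ is defined depending on whether $\bowtie_i = {\leq}$ or $\bowtie_i = {\geq}$:

\[ \Phi(\bowtie_i n_i\ C_i) = \begin{cases} \{C_i \sqsubseteq o_i^1 \sqcup \dots \sqcup o_i^{n_i}\} & \text{if } \bowtie_i = {\leq} \\ \{o_i^j \sqsubseteq C_i \mid 1 \le j \le n_i\} \cup \{o_i^j \sqsubseteq \neg o_i^\ell \mid 1 \le j < \ell \le n_i\} & \text{if } \bowtie_i = {\geq} \end{cases} \]

where $o_i^1, \dots, o_i^{n_i}$ are fresh and distinct nominals and we use the convention that the empty disjunction is interpreted as $\bot$ to deal with the case $n_i = 0$.

Assuming unary coding of numbers, the translation of a CBox $\mathcal{C}$ is obviously computable in polynomial time.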
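
A minimal Python sketch of this translation may help to see where the dependence on the coding of numbers comes from. It follows the case distinction used in the proof: an at-most restriction becomes one axiom with a disjunction of n fresh nominals, and an at-least restriction becomes n inclusion axioms plus one disjointness axiom per pair of nominals. The tuple encoding, the name `("o", i, j)` for the j-th nominal of the i-th restriction, and the string `"BOTTOM"` are assumptions of this example.

```python
def translate_restriction(index, op, n, concept):
    """Axioms (lhs, rhs), read as lhs ⊑ rhs, for the index-th restriction."""
    nominals = [("o", index, j) for j in range(1, n + 1)]
    if op == "<=":
        # The empty disjunction is read as bottom, which covers n = 0.
        rhs = ("or", *nominals) if nominals else "BOTTOM"
        return [(concept, rhs)]
    if op == ">=":
        axioms = [(o, concept) for o in nominals]
        axioms += [
            (nominals[j], ("not", nominals[l]))
            for j in range(n)
            for l in range(j + 1, n)
        ]
        return axioms
    raise ValueError(f"unknown comparison {op!r}")


def translate_cbox(cbox):
    """Union of the translations of all restrictions, with fresh nominals per restriction."""
    axioms = []
    for i, (op, n, concept) in enumerate(cbox, start=1):
        axioms.extend(translate_restriction(i, op, n, concept))
    return axioms
```

For `(">=", n, "C")` the sketch produces n + n(n-1)/2 axioms. This is polynomial in n itself, hence polynomial in the input size under unary coding, but exponential in the number of digits of n under binary coding.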

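Claim 5.6 $\mathcal{C}$ is satisfiable iff $\Phi(\mathcal{C})$ is satisfiable.

If $\mathcal{C}$ is satisfiable then there is a model $\mathcal{I}$ of $\mathcal{C}$ and $\mathcal{I} \models (\bowtie_i n_i\ C_i)$ for each $1 \le i \le k$. We show how to construct a model $\mathcal{I}'$ of $\Phi(\mathcal{C})$ from $\mathcal{I}$. $\mathcal{I}'$ will be identical to $\mathcal{I}$ in every respect except for the interpretation of the nominals $o_i^j$ (which do not appear in $\mathcal{C}$).

If $\bowtie_i = {\leq}$, then $\mathcal{I} \models \mathcal{C}$ implies $\sharp C_i^{\mathcal{I}} \le n_i$. If $n_i = 0$, then we have not introduced new nominals, and $\Phi(\mathcal{C})$ contains $C_i \sqsubseteq \bot$. Otherwise, we define $(o_i^j)^{\mathcal{I}'}$ such that $C_i^{\mathcal{I}} \subseteq \bigcup\{(o_i^j)^{\mathcal{I}'} \mid 1 \le j \le n_i\}$. This implies $C_i^{\mathcal{I}'} \subseteq (o_i^1)^{\mathcal{I}'} \cup \dots \cup (o_i^{n_i})^{\mathcal{I}'}$ and hence, in either case, $\mathcal{I}' \models \Phi(\leq n_i\ C_i)$.

If $\bowtie_i = {\geq}$, then $n_i > 0$ must hold, and $\mathcal{I} \models \mathcal{C}$ implies $\sharp C_i^{\mathcal{I}} \ge n_i$. Let $x_1, \dots, x_{n_i}$ be distinct elements from $\Delta^{\mathcal{I}}$ with $\{x_1, \dots, x_{n_i}\} \subseteq C_i^{\mathcal{I}}$. We set $(o_i^j)^{\mathcal{I}'} = \{x_j\}$. Since we have chosen distinct individuals to interpret different nominals, we have $\mathcal{I}' \models o_i^j \sqsubseteq \neg o_i^\ell$ for every $1 \le j < \ell \le n_i$. Moreover, $x_j \in C_i^{\mathcal{I}}$ implies $\mathcal{I}' \models o_i^j \sqsubseteq C_i$ and hence $\mathcal{I}' \models \Phi(\geq n_i\ C_i)$.

We have chosen distinct nominals for every cardinality restriction, hence the previous construction is well-defined and, since $\mathcal{I}'$ satisfies $\Phi(\bowtie_i n_i\ C_i)$ for every $i$, $\mathcal{I}' \models \Phi(\mathcal{C})$.

For the converse direction, let $\mathcal{I}$ be a model of $\Phi(\mathcal{C})$. The fact that $\mathcal{I} \models \mathcal{C}$ (and hence the satisfiability of $\mathcal{C}$) can be shown as follows: let $(\bowtie_i n_i\ C_i)$ be an arbitrary cardinality restriction in $\mathcal{C}$. If $\bowtie_i = {\leq}$ and $n_i = 0$, then we have $\Phi(\leq 0\ C_i) = \{C_i \sqsubseteq \bot\}$ and, since $\mathcal{I} \models \Phi(\mathcal{C})$, we have $C_i^{\mathcal{I}} = \emptyset$ and $\mathcal{I} \models (\leq 0\ C_i)$. If $\bowtie_i = {\leq}$ and $n_i > 0$, we have $\{C_i \sqsubseteq o_i^1 \sqcup \dots \sqcup o_i^{n_i}\} \subseteq \Phi(\mathcal{C})$. From $\mathcal{I} \models \Phi(\mathcal{C})$ follows $\sharp C_i^{\mathcal{I}} \le \sharp(o_i^1 \sqcup \dots \sqcup o_i^{n_i})^{\mathcal{I}} \le n_i$. If $\bowtie_i = {\geq}$, then we have $\{o_i^j \sqsubseteq C_i \mid 1 \le j \le n_i\} \cup \{o_i^j \sqsubseteq \neg o_i^\ell \mid 1 \le j < \ell \le n_i\} \subseteq \Phi(\mathcal{C})$. From the first set of axioms we get $\bigcup\{(o_i^j)^{\mathcal{I}} \mid 1 \le j \le n_i\} \subseteq C_i^{\mathcal{I}}$. From the second set of axioms we get that, for every $1 \le j < \ell \le n_i$, $(o_i^j)^{\mathcal{I}} \neq (o_i^\ell)^{\mathcal{I}}$. This implies that $n_i = \sharp\bigcup\{(o_i^j)^{\mathcal{I}} \mid 1 \le j \le n_i\} \le \sharp C_i^{\mathcal{I}}$.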

5.2 The Complexity of Cardinality Restrictions and 
Nominals 

We will now study the complexity of reasoning with cardinality restrictions both for ALCQ and ALCQI. Baader, Buchheit and Hollunder (1996) give an algorithm that decides satisfiability of CBoxes for ALCQ, but they do not give complexity results. Yet, it is easy to see that their algorithm runs in non-deterministic exponential time, which gives us a first upper bound for the complexity of the problem. For the lower bound, it is obvious that the problem is at least ExpTime-hard, due to Lemma 5.5 and Theorem 3.18. Lemma 5.5 also yields ExpTime as an upper bound for the complexity of this problem using the following result established by De Giacomo (1995).
Fact 5.7 (De Giacomo, 1995, Section 7.3)

Satisfiability and logical implication for CNO knowledge bases (TBox and ABox) are ExpTime-complete.

The DL CNO studied by the author is a strict extension of ALCQO. Unary coding of numbers is assumed throughout his thesis. Although the author imposes a unique name assumption, it is not inherent to the utilized reduction and must be explicitly enforced. It is thus possible to eliminate the formulas that require a unique interpretation of individuals from the reduction. Hence, according to Lemma 5.5, reasoning with cardinality restrictions for ALCQ can be reduced to CNO, which yields:

Corollary 5.8

Consistency of ALCQ-CBoxes is ExpTime-complete if unary coding of numbers is assumed.

For binary coding of numbers, the reduction used in the proof of Lemma 5.5 is no longer polynomial and, indeed, reasoning for ALCQ-CBoxes becomes at least NExpTime-hard if binary coding is assumed (Corollary 5.20).

5.2.1 Cardinality Restrictions and ACCQE 

The algorithm developed by Baader et al. (1996) for ALCQ with cardinality restrictions cannot easily be extended to ALCQI with cardinality restrictions. One indication for this is that the algorithm from (Baader et al., 1996) is a tableau algorithm that always constructs a finite model for a satisfiable CBox; yet, ALCQI with cardinality restrictions no longer has the finite model property. The CBox

\[ (\geq 1\ \neg A),\quad (\forall\ (\exists R.\top \sqcap (\leq 1\ R^{-1}) \sqcap \forall R.A)) \]

is satisfiable, but does not have a finite model. The first cardinality restriction requires the existence of an instance $x$ of $\neg A$ in the model. The second cardinality restriction requires every element of the model to have an $R$-successor, so from $x$ there starts an infinite path of $R$-successors. This path must either run into a cycle or there must be infinitely many elements in the model. It cannot cycle back to $x$ because this would conflict with the requirement that every element satisfies $\forall R.A$. It cannot cycle back to another element of the path because in that case, this element would have two incoming $R$-edges, which conflicts with $(\leq 1\ R^{-1})$.
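
The absence of finite models can be checked exhaustively for very small domains. The following Python sketch is only a sanity check under a direct encoding of the four conditions (some element outside A, every element has an R-successor, every element has at most one R-predecessor, all R-successors lie in A); it is not a decision procedure, and the function names are hypothetical.

```python
from itertools import product


def is_model(domain, a_ext, r_ext):
    """Check the two cardinality restrictions on a finite interpretation."""
    if all(x in a_ext for x in domain):
        return False  # (>= 1 not-A) needs an element outside A
    for x in domain:
        successors = [y for y in domain if (x, y) in r_ext]
        predecessors = [y for y in domain if (y, x) in r_ext]
        if not successors:
            return False  # exists R.Top
        if len(predecessors) > 1:
            return False  # at most one incoming R-edge
        if any(y not in a_ext for y in successors):
            return False  # forall R.A
    return True


def has_model_of_size(n):
    """Try every extension of A and every relation R over a domain of size n."""
    domain = range(n)
    pairs = [(x, y) for x in domain for y in domain]
    for a_bits in product([False, True], repeat=n):
        a_ext = {x for x in domain if a_bits[x]}
        for r_bits in product([False, True], repeat=len(pairs)):
            r_ext = {pair for pair, keep in zip(pairs, r_bits) if keep}
            if is_model(domain, a_ext, r_ext):
                return True
    return False
```

The search space grows as $2^{n + n^2}$, so the check is only practical for domains of up to three or four elements; the general statement rests on the argument in the text.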

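There exists no dedicated decision procedure for ALCQI with cardinality restrictions, but it is easy to see that the problem can be solved by a reduction to $C^2$, the two-variable fragment of FOL extended with counting quantifiers. Let $L^2$ denote the fragment of FOL that only has the variable symbols $x$ and $y$. Then $C^2$ is the extension of $L^2$ that admits all counting quantifiers $\exists^{\geq m}$ and $\exists^{\leq m}$ for $m \ge 1$, rather than only $\exists$. Grädel, Otto, and Rosen (1997) show that $C^2$ is decidable. Based on their decision procedure, Pacholski et al. (1997) determine the complexity of $C^2$:

Figure 5.1 The translation from ALCQI into $C^2$

\begin{align*}
\Psi_x(A) &= Ax \quad \text{for } A \in \mathsf{NC} \\
\Psi_x(\neg C) &= \neg\Psi_x(C) \\
\Psi_x(C_1 \sqcap C_2) &= \Psi_x(C_1) \wedge \Psi_x(C_2) \\
\Psi_x(\geq n\ R\ C) &= \exists^{\geq n} y.(Rxy \wedge \Psi_y(C)) \\
\Psi_x(\geq n\ R^{-1}\ C) &= \exists^{\geq n} y.(Ryx \wedge \Psi_y(C)) \\
\Psi_y(A) &= Ay \quad \text{for } A \in \mathsf{NC} \\
\Psi_y(\neg C) &= \neg\Psi_y(C) \\
\Psi_y(C_1 \sqcap C_2) &= \Psi_y(C_1) \wedge \Psi_y(C_2) \\
\Psi_y(\geq n\ R\ C) &= \exists^{\geq n} x.(Ryx \wedge \Psi_x(C)) \\
\Psi_y(\geq n\ R^{-1}\ C) &= \exists^{\geq n} x.(Rxy \wedge \Psi_x(C)) \\
\Psi(\bowtie n\ C) &= \exists^{\bowtie n} x.\Psi_x(C) \quad \text{for } \bowtie \in \{\geq, \leq\} \\
\Psi(\mathcal{C}) &= \bigwedge \{\Psi(\bowtie n\ C) \mid (\bowtie n\ C) \in \mathcal{C}\}
\end{align*}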



Fact 5.9 (Pacholski et al., 1997) 

Satisfiability of $C^2$ is decidable in 2-NExpTime for binary coding of numbers and is NExpTime-complete for unary coding of numbers.

Figure 5.1 shows how the standard translation of ALCQI into $C^2$ due to Borgida (1996) can be extended to cardinality restrictions. It is obviously a satisfiability preserving translation, which yields:

Lemma 5.10

An ALCQI-CBox $\mathcal{C}$ is satisfiable iff $\Psi(\mathcal{C})$ is satisfiable.

The translation from Figure 5.1 is obviously polynomial, and so we obtain, from Lemma 5.10 and Fact 5.9:

Lemma 5.11

Satisfiability of ALCQI-CBoxes can be decided in NExpTime, if unary coding of numbers in the input is assumed.

We will see that, from the viewpoint of worst-case complexity, this is an optimal result, as the problem is also NExpTime-hard. To prove this, we use a bounded version of the domino problem. Domino problems (Wang, 1963; Berger, 1966) have successfully been employed to establish undecidability and complexity results for various description and modal logics (Spaan, 1993a; Baader & Sattler, 1999).

Domino Systems 
Definition 5.12 

For $n \in \mathbb{N}$, let $\mathbb{Z}_n$ denote the set $\{0, \dots, n-1\}$ and $\oplus_n$ denote the addition modulo $n$. A domino system is a triple $\mathcal{D} = (D, H, V)$, where $D$ is a finite set (of tiles) and $H, V \subseteq D \times D$ are relations expressing horizontal and vertical compatibility constraints between the tiles. For $s, t \in \mathbb{N}$, let $U(s,t)$ be the torus $\mathbb{Z}_s \times \mathbb{Z}_t$, and let $w = w_0 \dots w_{n-1}$ be a word over $D$ of length $n$ (with $n \le s$). We say that $\mathcal{D}$ tiles $U(s,t)$ with initial condition $w$ iff there exists a mapping $\tau : U(s,t) \to D$ such that, for all $(x,y) \in U(s,t)$,

• if $\tau(x,y) = d$ and $\tau(x \oplus_s 1, y) = d'$, then $(d, d') \in H$ (horizontal constraint);

• if $\tau(x,y) = d$ and $\tau(x, y \oplus_t 1) = d'$, then $(d, d') \in V$ (vertical constraint);

• $\tau(i, 0) = w_i$ for $0 \le i < n$ (initial condition). ◇
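
The three conditions of this definition translate directly into a checker for a given candidate mapping. The Python sketch below assumes that the mapping $\tau$ is supplied as a dictionary from coordinate pairs to tiles and that $H$ and $V$ are sets of pairs; it only verifies a candidate tiling and does not search for one. The function name `is_tiling` is hypothetical.

```python
def is_tiling(tau, s, t, H, V, w):
    """Check that tau tiles the torus U(s, t) with initial condition w."""
    if len(w) > s:
        return False  # the initial word must fit into the first row
    for x in range(s):
        for y in range(t):
            d = tau[(x, y)]
            # horizontal neighbour, wrapping around modulo s
            if (d, tau[((x + 1) % s, y)]) not in H:
                return False
            # vertical neighbour, wrapping around modulo t
            if (d, tau[(x, (y + 1) % t)]) not in V:
                return False
    return all(tau[(i, 0)] == w[i] for i in range(len(w)))


# Toy domino system: tiles alternate horizontally and repeat vertically.
H = {("a", "b"), ("b", "a")}
V = {("a", "a"), ("b", "b")}


def stripes(s, t):
    return {(x, y): "a" if x % 2 == 0 else "b" for x in range(s) for y in range(t)}
```

On a torus of even width the striped mapping is a tiling; on odd width the wrap-around places two "a" tiles next to each other, which violates the horizontal constraint.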

Bounded domino systems are capable of expressing the computational behaviour of restricted, so-called simple, Turing Machines (TM). This restriction is non-essential in the following sense: Every language accepted in time $T(n)$ and space $S(n)$ by some one-tape TM is accepted within the same time and space bounds by a simple TM, as long as $S(n), T(n) \ge 2n$ (Börger, Grädel, & Gurevich, 1997).

Theorem 5.13 (Borger et al., 1997, Theorem 6.1.2) 

Let $M$ be a simple TM with input alphabet $\Sigma$. Then there exists a domino system $\mathcal{D} = (D, H, V)$ and a linear time reduction which takes any input $x \in \Sigma^*$ to a word $w \in D^*$ with $|x| = |w|$ such that

• if $M$ accepts $x$ in time $t_0$ with space $s_0$, then $\mathcal{D}$ tiles $U(s,t)$ with initial condition $w$ for all $s \ge s_0 + 2$, $t \ge t_0 + 2$;

• if $M$ does not accept $x$, then $\mathcal{D}$ does not tile $U(s,t)$ with initial condition $w$ for any $s, t \ge 2$.

Corollary 5.14 

There is a domino system $\mathcal{D}$ such that the following is an NExpTime-hard problem:

Given an initial condition $w = w_0 \dots w_{n-1}$ of length $n$. Does $\mathcal{D}$ tile the torus $U(2^{n+1}, 2^{n+1})$ with initial condition $w$?

Proof. Let $M$ be a (w.l.o.g. simple) non-deterministic TM with time- (and hence space-) bound $2^n$ deciding an arbitrary NExpTime-complete language $\mathcal{L}(M)$ over the alphabet $\Sigma$. Let $\mathcal{D}$ be the according domino system and $trans$ the reduction from Theorem 5.13.

The function $trans$ is a linear reduction from $\mathcal{L}(M)$ to the problem above: For $v \in \Sigma^*$ with $|v| = n$, it holds that $v \in \mathcal{L}(M)$ iff $M$ accepts $v$ in time and space $2^{|v|}$ iff $\mathcal{D}$ tiles $U(2^{n+1}, 2^{n+1})$ with initial condition $trans(v)$. ■

Defining a Torus of Exponential Size

Similar to proving undecidability by reduction of unbounded domino problems, where defining infinite grids is the key problem, defining a torus of exponential size is the key to obtain a NExpTime-completeness proof by reduction of bounded domino problems.

To be able to apply Corollary 5.14 to CBox satisfiability for ALCQI, we must characterize the torus $\mathbb{Z}_{2^n} \times \mathbb{Z}_{2^n}$ with a CBox of polynomial size. To characterize this torus, we use $2n$ concepts $X_0, \dots, X_{n-1}$ and $Y_0, \dots, Y_{n-1}$, where $X_i$ ($Y_i$) codes the $i$th bit of the binary representation of the X-coordinate (Y-coordinate) of an element $a$.

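For an interpretation $\mathcal{I}$ and an element $a \in \Delta^{\mathcal{I}}$, we define $pos(a)$ by

\[ pos(a) := (xpos(a), ypos(a)) := \left( \sum_{i=0}^{n-1} x_i \cdot 2^i,\ \sum_{i=0}^{n-1} y_i \cdot 2^i \right), \text{ where} \]

\[ x_i = \begin{cases} 0, & \text{if } a \notin X_i^{\mathcal{I}} \\ 1, & \text{otherwise} \end{cases} \qquad y_i = \begin{cases} 0, & \text{if } a \notin Y_i^{\mathcal{I}} \\ 1, & \text{otherwise.} \end{cases} \]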

We use a well-known characterization of binary addition (see, e.g., Börger et al., 1997) to interrelate the positions of the elements in the torus:

Lemma 5.15 

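Let $x, x'$ be natural numbers with binary representations

\[ x = \sum_{i=0}^{n-1} x_i \cdot 2^i \quad \text{and} \quad x' = \sum_{i=0}^{n-1} x'_i \cdot 2^i. \]

Then

\begin{align*}
x' \equiv x + 1 \pmod{2^n} \quad \text{iff} \quad & \bigwedge_{k=0}^{n-1} \Big( \bigwedge_{j=0}^{k-1} x_j = 1 \Big) \rightarrow (x_k = 1 \leftrightarrow x'_k = 0) \\
\wedge\ & \bigwedge_{k=0}^{n-1} \Big( \bigvee_{j=0}^{k-1} x_j = 0 \Big) \rightarrow (x_k = x'_k),
\end{align*}

where the empty conjunction and disjunction are interpreted as true and false, respectively.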
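
The characterization can be checked mechanically for small bit widths. The Python sketch below evaluates the two conjuncts bit by bit and compares the result with ordinary modular arithmetic; the helper names `bits` and `increment_formula` are hypothetical and introduced only for this illustration.

```python
def bits(value, n):
    """Little-endian list of the n lowest bits of value."""
    return [(value >> i) & 1 for i in range(n)]


def increment_formula(x, x_prime, n):
    """Evaluate the bitwise characterization of x' = x + 1 (mod 2^n)."""
    xb, xpb = bits(x, n), bits(x_prime, n)
    for k in range(n):
        # all() over an empty range is True, matching the empty conjunction
        lower_bits_all_one = all(xb[j] == 1 for j in range(k))
        if lower_bits_all_one:
            # a carry reaches bit k, so the bit must flip
            if (xb[k] == 1) != (xpb[k] == 0):
                return False
        else:
            # some lower bit is 0, so bit k must stay unchanged
            if xb[k] != xpb[k]:
                return False
    return True


def agrees_with_arithmetic(n):
    """Compare the formula with (x + 1) mod 2^n on all pairs of n-bit numbers."""
    size = 2 ** n
    return all(
        increment_formula(x, xp, n) == (xp == (x + 1) % size)
        for x in range(size)
        for xp in range(size)
    )
```

The two branches correspond to the two big conjunctions: the premise of the second conjunct (some lower bit is 0) is exactly the negation of the premise of the first.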

The CBox $\mathcal{C}_n$ is defined in Figure 5.2. The concept $C_{(0,0)}$ is satisfied by all elements $a$ of the domain for which $pos(a) = (0,0)$ holds. $C_{(2^n-1,2^n-1)}$ is a similar concept, whose instances $a$ satisfy $pos(a) = (2^n - 1, 2^n - 1)$.

The concept $D_{north}$ is similar to $D_{east}$ where the role $north$ has been substituted for $east$ and variables $X_i$ and $Y_i$ have been swapped. The concept $D_{east}$ ($D_{north}$) enforces that, along the role $east$ ($north$), the value of $xpos$ ($ypos$) increases by one while the value of $ypos$ ($xpos$) is unchanged. They are analogous to the formula in Lemma 5.15.
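
How the bit concepts determine the position of an element can be made concrete with a few lines of Python. The sketch assumes that the extensions of $X_0, \dots, X_{n-1}$ and $Y_0, \dots, Y_{n-1}$ are given as lists of sets, with index $i$ holding the extension of the $i$th bit concept; the function names are hypothetical.

```python
def coordinate(a, bit_extensions):
    """Sum of 2^i over all i such that a belongs to the i-th bit concept."""
    return sum(2 ** i for i, ext in enumerate(bit_extensions) if a in ext)


def pos(a, x_extensions, y_extensions):
    """Position of a as the pair (xpos(a), ypos(a))."""
    return (coordinate(a, x_extensions), coordinate(a, y_extensions))


# Example with n = 2: X_0 = {a, c}, X_1 = {c}, Y_0 = {}, Y_1 = {a}.
x_ext = [{"a", "c"}, {"c"}]
y_ext = [set(), {"a"}]
```

In this example the element a lies in $X_0$ and $Y_1$ only, so its position is (1, 2), while an element that belongs to none of the bit concepts is mapped to (0, 0).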

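The following lemma is a consequence of the definition of $pos$ and Lemma 5.15.

Figure 5.2 A CBox defining a torus of exponential size

\begin{align*}
\mathcal{C}_n = \{\ & (\forall\ \exists east.\top),\ (\forall\ \exists north.\top), \\
& (\forall\ ({=}\,1\ east^{-1}\ \top)),\ (\forall\ ({=}\,1\ north^{-1}\ \top)), \\
& (\geq 1\ C_{(0,0)}),\ (\geq 1\ C_{(2^n-1,2^n-1)}), \\
& (\leq 1\ C_{(2^n-1,2^n-1)}),\ (\forall\ D_{east} \sqcap D_{north})\ \}
\end{align*}

\begin{align*}
C_{(0,0)} &= \bigsqcap_{k=0}^{n-1} \neg X_k \sqcap \bigsqcap_{k=0}^{n-1} \neg Y_k \\
C_{(2^n-1,2^n-1)} &= \bigsqcap_{k=0}^{n-1} X_k \sqcap \bigsqcap_{k=0}^{n-1} Y_k \\
D_{east} &= \bigsqcap_{k=0}^{n-1} \Big(\bigsqcap_{j=0}^{k-1} X_j\Big) \rightarrow \big((X_k \rightarrow \forall east.\neg X_k) \sqcap (\neg X_k \rightarrow \forall east.X_k)\big)\ \sqcap \\
&\phantom{=}\ \bigsqcap_{k=0}^{n-1} \Big(\bigsqcup_{j=0}^{k-1} \neg X_j\Big) \rightarrow \big((X_k \rightarrow \forall east.X_k) \sqcap (\neg X_k \rightarrow \forall east.\neg X_k)\big)\ \sqcap \\
&\phantom{=}\ \bigsqcap_{k=0}^{n-1} \big((Y_k \rightarrow \forall east.Y_k) \sqcap (\neg Y_k \rightarrow \forall east.\neg Y_k)\big) \\
D_{north} &= \bigsqcap_{k=0}^{n-1} \Big(\bigsqcap_{j=0}^{k-1} Y_j\Big) \rightarrow \big((Y_k \rightarrow \forall north.\neg Y_k) \sqcap (\neg Y_k \rightarrow \forall north.Y_k)\big)\ \sqcap \\
&\phantom{=}\ \bigsqcap_{k=0}^{n-1} \Big(\bigsqcup_{j=0}^{k-1} \neg Y_j\Big) \rightarrow \big((Y_k \rightarrow \forall north.Y_k) \sqcap (\neg Y_k \rightarrow \forall north.\neg Y_k)\big)\ \sqcap \\
&\phantom{=}\ \bigsqcap_{k=0}^{n-1} \big((X_k \rightarrow \forall north.X_k) \sqcap (\neg X_k \rightarrow \forall north.\neg X_k)\big)
\end{align*}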



Lemma 5.16 

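Let $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ be an interpretation, $D_{east}$, $D_{north}$ defined as in Figure 5.2, and $a, b \in \Delta^{\mathcal{I}}$.

\begin{align*}
(a,b) \in east^{\mathcal{I}} \text{ and } a \in D_{east}^{\mathcal{I}} \text{ implies:}\quad & xpos(b) \equiv xpos(a) + 1 \pmod{2^n} \\
& ypos(b) = ypos(a) \\
(a,b) \in north^{\mathcal{I}} \text{ and } a \in D_{north}^{\mathcal{I}} \text{ implies:}\quad & xpos(b) = xpos(a) \\
& ypos(b) \equiv ypos(a) + 1 \pmod{2^n}
\end{align*}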

The CBox $\mathcal{C}_n$ defines a torus of exponential size in the following sense:

Lemma 5.17

Let $\mathcal{C}_n$ be the CBox as defined in Figure 5.2. Let $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ be a model of $\mathcal{C}_n$. Then

\[ (\Delta^{\mathcal{I}}, east^{\mathcal{I}}, north^{\mathcal{I}}) \cong (U(2^n, 2^n), S_1, S_2), \]

where $U(2^n, 2^n)$ is the torus $\mathbb{Z}_{2^n} \times \mathbb{Z}_{2^n}$ and $S_1, S_2$ are the horizontal and vertical successor relations on this torus.

Proof. We show that the function $pos$ is an isomorphism from $\Delta^{\mathcal{I}}$ to $U(2^n, 2^n)$. Injectivity of $pos$ is shown by induction on the "Manhattan distance" $d(a)$ of the $pos$-value of an element $a$ to the $pos$-value of the upper right corner.
For an element $a \in \Delta^{\mathcal{I}}$ we define $d(a)$ by

\[ d(a) = (2^n - 1 - xpos(a)) + (2^n - 1 - ypos(a)). \]

Note that $pos(a) = pos(b)$ implies $d(a) = d(b)$. Since $\mathcal{I} \models (\leq 1\ C_{(2^n-1,2^n-1)})$, there is at most one element $a \in \Delta^{\mathcal{I}}$ such that $d(a) = 0$. Hence, there is at most one element $a$ such that $pos(a) = (2^n - 1, 2^n - 1)$. Now assume there are elements $a, b \in \Delta^{\mathcal{I}}$ such that $pos(a) = pos(b)$ and $d(a) = d(b) > 0$. Then $xpos(a) < 2^n - 1$ or $ypos(a) < 2^n - 1$. W.l.o.g., we assume $xpos(a) < 2^n - 1$. From $\mathcal{I} \models \mathcal{C}_n$, it follows that $a, b \in (\exists east.\top)^{\mathcal{I}}$. Let $a_1, b_1$ be elements such that $(a, a_1) \in east^{\mathcal{I}}$ and $(b, b_1) \in east^{\mathcal{I}}$. From Lemma 5.16, it follows that

\begin{align*}
xpos(a_1) &\equiv xpos(b_1) \equiv xpos(a) + 1 \pmod{2^n} \\
ypos(a_1) &= ypos(b_1) = ypos(a).
\end{align*}

This implies $pos(a_1) = pos(b_1)$ and, since $xpos(a) < 2^n - 1$, it holds that $xpos(a_1) = xpos(b_1) = xpos(a) + 1 > xpos(a)$. Hence, $d(a_1) = d(b_1) < d(a)$ and the induction hypothesis is applicable, which yields $a_1 = b_1$. This also implies $a = b$ because $a_1 \in ({=}\,1\ east^{-1}\ \top)^{\mathcal{I}}$ and $\{(a, a_1), (b, a_1)\} \subseteq east^{\mathcal{I}}$. Hence $pos$ is injective.

To prove that $pos$ is also surjective we use a similar technique. This time, we use an induction on the distance from the lower left corner. For each element $(x,y) \in U(2^n, 2^n)$, we define:

\[ d'(x,y) = x + y. \]

We show by induction that, for each $(x,y) \in U(2^n, 2^n)$, there is an element $a \in \Delta^{\mathcal{I}}$ such that $pos(a) = (x,y)$. If $d'(x,y) = 0$, then $x = y = 0$. Since $\mathcal{I} \models (\geq 1\ C_{(0,0)})$, there is an element $a \in \Delta^{\mathcal{I}}$ such that $pos(a) = (0,0)$. Now consider $(x,y) \in U(2^n, 2^n)$ with $d'(x,y) > 0$. Without loss of generality we assume $x > 0$ (if $x = 0$ then $y > 0$ must hold). Hence $(x-1, y) \in U(2^n, 2^n)$ and $d'(x-1, y) < d'(x,y)$. From the induction hypothesis, it follows that there is an element $a \in \Delta^{\mathcal{I}}$ such that $pos(a) = (x-1, y)$. Then there must be an element $a_1$ such that $(a, a_1) \in east^{\mathcal{I}}$ and Lemma 5.16 implies that $pos(a_1) = (x,y)$. Hence $pos$ is also surjective.

Finally, $pos$ is indeed a homomorphism as an immediate consequence of Lemma 5.16.

■

It is interesting to note that we need inverse roles only to guarantee that the function pos is injective. The same can be achieved by adding the cardinality restriction $(\leq (2^n \cdot 2^n)\ \top)$ to $C_n$, from which the injectivity of pos follows from its surjectivity and simple cardinality considerations. Of course, the size of this cardinality restriction is polynomial in $n$ only if we assume binary coding of numbers. This has consequences for the complexity of ALCQ-CBoxes if binary coding of numbers in the input is assumed (see Corollary 5.20).

Also note that we have made explicit use of the special expressive power of cardinality restrictions by stating that, in any model of $C_n$, the extension of $C_{(2^n-1,2^n-1)}$ must have at most one element. This cannot be expressed with an ALCQI-TBox consisting of terminological axioms.

Reducing Domino Problems to CBox Satisfiability 

Once Lemma 5.17 has been proved, it is easy to reduce the bounded domino problem 
to CBox satisfiability. We use the standard reduction that has been applied in the DL 
context, e.g., by Baader and Sattler (1999). 

Lemma 5.18

Let $\mathcal{D} = (D, V, H)$ be a domino system. Let $w = w_0 \ldots w_{n-1} \in D^*$. There is a CBox $C(n, \mathcal{D}, w)$ such that:

• $C(n, \mathcal{D}, w)$ is satisfiable iff $\mathcal{D}$ tiles $U(2^n, 2^n)$ with initial condition $w$, and

• $C(n, \mathcal{D}, w)$ can be computed in time polynomial in $n$.

Proof. We define $C(n, \mathcal{D}, w) := C_n \cup C_\mathcal{D} \cup C_w$, where $C_n$ is defined in Figure 5.2, $C_\mathcal{D}$ captures the vertical and horizontal compatibility constraints of the domino system $\mathcal{D}$, and $C_w$ enforces the initial condition. We use an atomic concept $C_d$ for each tile $d \in D$. $C_\mathcal{D}$ consists of the following cardinality restrictions:

$$\Bigl(\forall\ \bigsqcup_{d \in D} C_d\Bigr), \qquad \Bigl(\forall\ \bigsqcap_{d \in D}\ \bigsqcap_{d' \in D \setminus \{d\}} \neg(C_d \sqcap C_{d'})\Bigr),$$

$$\Bigl(\forall\ \bigsqcap_{d \in D} \bigl(C_d \to (\forall east.\bigsqcup_{(d,d') \in H} C_{d'})\bigr)\Bigr), \qquad \Bigl(\forall\ \bigsqcap_{d \in D} \bigl(C_d \to (\forall north.\bigsqcup_{(d,d') \in V} C_{d'})\bigr)\Bigr).$$

$C_w$ consists of the cardinality restrictions

$$(\forall\ (C_{(0,0)} \to C_{w_0})),\ \ldots,\ (\forall\ (C_{(n-1,0)} \to C_{w_{n-1}})),$$

where, for each $x, y$, $C_{(x,y)}$ is a concept that is satisfied by an element $a$ iff $pos(a) = (x, y)$, defined similarly to $C_{(0,0)}$ and $C_{(2^n-1,2^n-1)}$.

From the definition of $C(n, \mathcal{D}, w)$ and Lemma 5.17, it follows that each model of $C(n, \mathcal{D}, w)$ immediately induces a tiling of $U(2^n, 2^n)$ and vice versa. Also, for a fixed domino system $\mathcal{D}$, $C(n, \mathcal{D}, w)$ is obviously polynomially computable. ■
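To make the three groups of constraints in this proof concrete, the following sketch checks directly whether a given assignment of tiles is a tiling of a (small) torus with an initial condition. It is an illustration only: the function name `is_torus_tiling`, the dictionary representation of the tiling, and the reading of H and V as sets of allowed (tile, neighbouring tile) pairs are assumptions of ours, and the torus is passed with an explicit side length instead of $2^n$.

```python
def is_torus_tiling(tiling, size, H, V, w):
    """Check a candidate tiling of the torus U(size, size).

    tiling maps (x, y) to a tile; H and V are sets of allowed pairs for
    horizontal (east) and vertical (north) neighbours; w is the initial
    condition, read along the row y = 0 starting at x = 0.
    """
    for x in range(size):
        for y in range(size):
            d = tiling[(x, y)]
            # east neighbour, wrapping around as on the torus
            if (d, tiling[((x + 1) % size, y)]) not in H:
                return False
            # north neighbour, wrapping around as on the torus
            if (d, tiling[(x, (y + 1) % size)]) not in V:
                return False
    return all(tiling[(i, 0)] == tile for i, tile in enumerate(w))
```

The two membership tests correspond to the compatibility restrictions in $C_\mathcal{D}$, the final line to $C_w$; that every position carries exactly one tile is built into the dictionary representation.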

The main result of this section is now an immediate consequence of Lemma 5.11, Lemma 5.18, and Corollary 5.14:

Theorem 5.19

Satisfiability of ALCQI-CBoxes is NExpTime-hard. It is NExpTime-complete if unary coding of numbers is used in the input.

Recalling the note below the proof of Lemma 5.17, we see that the same reduction also applies to ALCQ if we allow binary coding of numbers.

Corollary 5.20

Satisfiability of ALCQ-CBoxes is NExpTime-hard if binary coding is used to represent numbers in cardinality restrictions.

It should be noted that it is open whether the problem can be decided in NExpTime if binary coding of numbers is used. In fact, the reduction to $C^2$ only yields decidability in 2-NExpTime if binary coding is assumed.

We have already seen that, for unary coding of numbers, deciding satisfiability of ALCQ-CBoxes can be done in ExpTime (Corollary 5.8). This shows that the coding of numbers indeed has an influence on the complexity of the reasoning problem. For the problem of concept satisfiability in ALCQ this is not the case; in Chapter 4 we have shown that the complexity of the problem does not rise when going from unary to binary coding.

For unary coding, we needed both inverse roles and cardinality restrictions for the reduction. This is consistent with the fact that satisfiability for ALCQI concepts with respect to TBoxes that consist of terminological axioms is still in ExpTime. This can be shown by a reduction to the ExpTime-complete logics CIN (De Giacomo, 1995) or CPDL (Pratt, 1979). This shows that cardinality restrictions on concepts are an additional source of complexity.

Using Lemma 5.5 it is now also possible to determine the complexity of reasoning with ALCQIO-TBoxes:

Corollary 5.21

Satisfiability of ALCQIO-TBoxes is NExpTime-hard. It is NExpTime-complete if unary coding of numbers in the input is assumed.

Proof. Lemma 5.5 states that satisfiability of ALCQIO-TBoxes and satisfiability of ALCQI-CBoxes are mutually polynomially reducible problems. Hence, both the lower and the upper complexity bound follow from Theorem 5.19. ■

This result explains a gap in (De Giacomo, 1995). There the author establishes the complexity of satisfiability of knowledge bases consisting of TBoxes and ABoxes both for CNO, which allows for qualifying number restrictions, and for CIO, which allows for inverse roles, by reduction to the ExpTime-complete logic PDL. No results are given for the combination CINO, which is a strict extension of ALCQIO. Corollary 5.21 shows that, assuming ExpTime ≠ NExpTime, there cannot be a polynomial reduction from the satisfiability problem of CINO knowledge bases to PDL. A possible explanation for this leap in complexity is the loss of the tree model property, which has been proposed by Vardi (1996) and Grädel (1999c) as an explanation for good algorithmic properties of a logic. While, for CIO and CNO, satisfiability is decided by searching for tree-like pseudo-models even in the presence of nominals, this seems no longer to be possible in the case of knowledge bases for CINO.

Unique Name Assumption

It should be noted that our definition of nominals is non-standard for DLs in the sense that we do not impose the unique name assumption that is widely made, i.e., for any two individual names $o_1, o_2 \in NI$, $o_1^\mathcal{I} \neq o_2^\mathcal{I}$ is required. Even without a unique name assumption, it is possible to enforce distinct interpretation of nominals by adding axioms of the form $o_1 \sqsubseteq \neg o_2$, which we have already used in the proof of Lemma 5.3. Moreover, imposing a unique name assumption in the presence of inverse roles and number restrictions leads to peculiar effects. Consider the following TBox:

$$T = \{\, o \sqsubseteq (\leq k\ R\ \top),\ \ \top \sqsubseteq \exists R^{-1}.o \,\}$$

Under the unique name assumption, $T$ is satisfiable iff $NI$ contains at most $k$ individual names, because each individual name must be interpreted by a unique element of the domain, every element of the domain must be reachable from $o^\mathcal{I}$ via the role $R$, and $o^\mathcal{I}$ may have at most $k$ $R$-successors. We believe that this dependency of the satisfiability of a TBox on constraints that are not explicit in the TBox is counter-intuitive and hence have not imposed the unique name assumption.

Nevertheless, it is possible to obtain a tight complexity bound for satisfiability of ALCQIO-TBoxes with unique name assumption without using Lemma 5.5, but by an immediate adaptation of the proof of Theorem 5.19.

Corollary 5.22

Satisfiability of ALCQIO-TBoxes with the unique name assumption is NExpTime-hard. It is NExpTime-complete if unary coding of numbers in the input is assumed.

Proof. A simple inspection of the reduction used to prove Theorem 5.19, and especially of the proof of Lemma 5.17, shows that a single nominal, which marks the upper right corner of the torus, is sufficient to perform the reduction. If $o$ is an individual name and create is a role name, then the following TBox defines a torus of exponential size:

$$\begin{aligned}
T_n = \{\; & \top \sqsubseteq \exists east.\top, & & \top \sqsubseteq \exists north.\top, \\
& \top \sqsubseteq (= 1\ east^{-1}\ \top), & & \top \sqsubseteq (= 1\ north^{-1}\ \top), \\
& \top \sqsubseteq \exists create.C_{(0,0)}, & & \top \sqsubseteq D_{east} \sqcap D_{north}, \\
& C_{(2^n-1,2^n-1)} \sqsubseteq o, & & o \sqsubseteq C_{(2^n-1,2^n-1)} \;\}
\end{aligned}$$

Since this reduction uses only a single individual name, the unique name assumption is irrelevant in this case. ■

Internalization of Axioms 

In the presence of inverse roles and nominals, it is possible to internalise general inclusion axioms into concepts (Baader, 1991; Schild, 1991; Baader, Bürckert, Nebel, Nutt, & Smolka, 1993) using the spy-point technique used, e.g., by Blackburn and Seligman (1995) and Areces, Blackburn, and Marx (1999). The main idea of this technique is to enforce that all elements in the model of a concept are reachable in a single step from a distinguished point (the spy-point) marked by an individual name.

Definition 5.23

Let $T$ be an ALCQIO-TBox. W.l.o.g., we assume that $T$ contains only a single axiom $\top \sqsubseteq D$. Let spy denote a fresh role name and $i$ a fresh individual name. We define the function $\cdot^{spy}$ inductively on the structure of concepts by setting $A^{spy} = A$ for all $A \in NC$, $o^{spy} = o$ for all $o \in NI$, $(\neg C)^{spy} = \neg C^{spy}$, $(C_1 \sqcap C_2)^{spy} = C_1^{spy} \sqcap C_2^{spy}$, and $(\geq n\ R\ C)^{spy} = (\geq n\ R\ (\exists spy^{-1}.i) \sqcap C^{spy})$.

The internalization $C_T$ of $T$ is defined as follows:

$$C_T = i \sqcap D^{spy} \sqcap \forall spy.D^{spy}$$
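The inductive definition of the spy translation can be read as a simple recursion over the syntax tree of a concept. The sketch below assumes a tuple encoding of concepts that is ours, not the thesis's: `("atom", A)`, `("nominal", o)`, `("not", C)`, `("and", C1, C2)`, and `("atleast", n, R, C)`. The string `"spy-"` stands for the inverse role of spy, $\exists spy^{-1}.i$ is written as an at-least-1 restriction, and $\forall spy.D^{spy}$ as its dual, so that the output stays within the same five constructors.

```python
SPY_INV = "spy-"               # hypothetical name for the inverse of the role spy
SPY_POINT = ("nominal", "i")   # the fresh individual name i


def spy(concept):
    """Relativise all number restrictions to elements seen from the spy-point."""
    tag = concept[0]
    if tag in ("atom", "nominal"):
        return concept
    if tag == "not":
        return ("not", spy(concept[1]))
    if tag == "and":
        return ("and", spy(concept[1]), spy(concept[2]))
    if tag == "atleast":
        _, n, role, filler = concept
        guard = ("atleast", 1, SPY_INV, SPY_POINT)   # encodes "exists spy^-1 . i"
        return ("atleast", n, role, ("and", guard, spy(filler)))
    raise ValueError(f"unknown constructor: {tag}")


def internalise(d):
    """Build C_T for the TBox consisting of the single axiom: top is subsumed by d."""
    d_spy = spy(d)
    # "forall spy . D^spy" written as "not (at least 1 spy (not D^spy))"
    all_spy = ("not", ("atleast", 1, "spy", ("not", d_spy)))
    return ("and", SPY_POINT, ("and", d_spy, all_spy))
```

Each constructor is visited once and every number restriction grows by a constant-size guard, so the result is linear in the size of the input concept.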

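Lemma 5.24

Let $T$ be an ALCQIO-TBox. $T$ is satisfiable iff $C_T$ is satisfiable.

Proof. For the if-direction let $\mathcal{I}$ be a model of $C_T$ with $a \in (C_T)^\mathcal{I}$. This implies $i^\mathcal{I} = \{a\}$. Let $\mathcal{I}'$ be defined by

$$\Delta^{\mathcal{I}'} = \{a\} \cup \{x \in \Delta^\mathcal{I} \mid (a, x) \in spy^\mathcal{I}\}$$

and $\cdot^{\mathcal{I}'} = \cdot^\mathcal{I}|_{\Delta^{\mathcal{I}'}}$.

Claim 5.25 For every $x \in \Delta^{\mathcal{I}'}$ and every ALCQIO-concept $C$, we have $x \in (C^{spy})^\mathcal{I}$ iff $x \in C^{\mathcal{I}'}$.

We prove this claim by induction on the structure of $C$. The only interesting case is $C = (\geq n\ R\ D)$. In this case $C^{spy} = (\geq n\ R\ (\exists spy^{-1}.i) \sqcap D^{spy})$. We have

$$\begin{aligned}
& x \in (\geq n\ R\ (\exists spy^{-1}.i) \sqcap D^{spy})^\mathcal{I} \\
\text{iff}\quad & \sharp\{y \in \Delta^\mathcal{I} \mid (x, y) \in R^\mathcal{I} \text{ and } y \in (\exists spy^{-1}.i)^\mathcal{I} \cap (D^{spy})^\mathcal{I}\} \geq n \\
(*)\ \text{iff}\quad & \sharp\{y \in \Delta^{\mathcal{I}'} \mid (x, y) \in R^{\mathcal{I}'} \text{ and } y \in D^{\mathcal{I}'}\} \geq n \\
\text{iff}\quad & x \in (\geq n\ R\ D)^{\mathcal{I}'},
\end{aligned}$$

where the equivalence (*) holds because, if $y \in (\exists spy^{-1}.i)^\mathcal{I} \cap (D^{spy})^\mathcal{I}$, then $y \in \Delta^{\mathcal{I}'}$ and $y \in D^{\mathcal{I}'}$ by induction. Also, if $y \in \Delta^{\mathcal{I}'}$, then $(x, y) \in R^\mathcal{I}$ iff $(x, y) \in R^{\mathcal{I}'}$, and hence the sets $\{y \in \Delta^\mathcal{I} \mid (x, y) \in R^\mathcal{I} \text{ and } y \in (\exists spy^{-1}.i)^\mathcal{I} \cap (D^{spy})^\mathcal{I}\}$ and $\{y \in \Delta^{\mathcal{I}'} \mid (x, y) \in R^{\mathcal{I}'} \text{ and } y \in D^{\mathcal{I}'}\}$ are equal.

By construction, for every $x \in \Delta^{\mathcal{I}'}$, $x \in (D^{spy})^\mathcal{I}$. Due to Claim 5.25, this implies $x \in D^{\mathcal{I}'}$ and hence $\mathcal{I}' \models \top \sqsubseteq D$.

For the only-if-direction, let $\mathcal{I}$ be an interpretation with $\mathcal{I} \models T$. We pick an arbitrary element $a \in \Delta^\mathcal{I}$ and define an extension $\mathcal{I}'$ of $\mathcal{I}$ by setting $i^{\mathcal{I}'} = \{a\}$ and $spy^{\mathcal{I}'} = \{(a, x) \mid x \in \Delta^\mathcal{I}\}$. Since $i$ and spy do not occur in $T$, we still have that $\mathcal{I}' \models T$.

Claim 5.26 For every $x \in \Delta^{\mathcal{I}'}$ and every ALCQIO-concept $C$ that does not contain $i$ or spy, $x \in C^{\mathcal{I}'}$ iff $x \in (C^{spy})^{\mathcal{I}'}$.

Again, this claim is proved by induction on the structure of concepts and the only interesting case is $C = (\geq n\ R\ E)$.

$$\begin{aligned}
& x \in (\geq n\ R\ E)^{\mathcal{I}'} \\
\text{iff}\quad & \sharp\{y \in \Delta^{\mathcal{I}'} \mid (x, y) \in R^{\mathcal{I}'} \text{ and } y \in E^{\mathcal{I}'}\} \geq n \\
(*)\ \text{iff}\quad & \sharp\{y \in \Delta^{\mathcal{I}'} \mid (x, y) \in R^{\mathcal{I}'},\ (a, y) \in spy^{\mathcal{I}'}, \text{ and } y \in (E^{spy})^{\mathcal{I}'}\} \geq n \\
\text{iff}\quad & x \in (\geq n\ R\ (\exists spy^{-1}.i) \sqcap E^{spy})^{\mathcal{I}'}.
\end{aligned}$$

The equivalence (*) holds because, by construction of $\mathcal{I}'$, $(a, y) \in spy^{\mathcal{I}'}$ holds for every element $y$ of the domain and $y \in E^{\mathcal{I}'}$ iff $y \in (E^{spy})^{\mathcal{I}'}$ holds by induction.

Since $\mathcal{I}' \models \top \sqsubseteq D$, Claim 5.26 yields that $(D^{spy})^{\mathcal{I}'} = \Delta^{\mathcal{I}'}$ and hence $a \in (C_T)^{\mathcal{I}'}$. ■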

As a consequence, we obtain the sharper result that already pure concept satisfiability for ALCQIO is a NExpTime-complete problem.

Corollary 5.27

Concept satisfiability for ALCQIO is NExpTime-hard. It is NExpTime-complete if unary coding of numbers in the input is assumed.

Proof. From Lemma 5.24, we get that the function mapping an ALCQIO-TBox $T$ to $C_T$ is a reduction from satisfiability of ALCQIO-TBoxes to satisfiability of ALCQIO-concepts. From Corollary 5.21 we know that the former problem is NExpTime-complete. Obviously, $C_T$ can be computed from $T$ in polynomial time. Hence, the lower complexity bound transfers. The NExpTime upper bound is a consequence of Corollary 5.21 and the fact that an ALCQIO-concept $C$ is satisfiable iff, for an individual $j$ that does not occur in $C$, the TBox $\{j \sqsubseteq C\}$ is satisfiable. ■



5.2.2 Boolean Role Expressions 

In Chapter 4, we have studied the DL ALCQIb, which allowed for a restricted, so-called safe, form of Boolean combination of roles, and for which concept satisfiability is decidable in polynomial space. It is easy to see that the results established for ALCQI in this chapter all transfer to ALCQIb and we state them here as (indeed, trivial) corollaries.

We have already argued that the restriction to safe role expressions is necessary to obtain a DL for which satisfiability is still decidable in polynomial space: the concept $(\leq 0\ (R \sqcup \neg R)\ \neg C)$ is satisfiable iff $C$ is globally satisfiable, which is an ExpTime-complete problem (see Theorem 3.18). Indeed, as a corollary of Theorem 5.19, it can be shown that concept satisfiability becomes NExpTime-hard in the presence of arbitrary Boolean operations on roles.

Definition 5.28

The DL ALCQIB is defined as ALCQIb with the exception that arbitrary role expressions are allowed. The DL ALCQB is the restriction of ALCQIB that disallows inverse roles. The semantics of ALCQIB and ALCQB are defined as for ALCQIb. ◇

Decidability of concept and CBox satisfiability for ALCQIb, ALCQB, and ALCQIB in NExpTime can easily be shown by extending the embedding $\Psi_x$ into $C^2$ from Figure 5.1 to deal with Boolean combination of roles.



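Lemma 5.29

Satisfiability of ALCQIB-concepts and ALCQIB-CBoxes is polynomially reducible to $C^2$-satisfiability.

Proof. For a role expression $\omega$, we define $\Psi_{xy}(\omega)$ inductively by

$$\begin{aligned}
\Psi_{xy}(R) &= Rxy \\
\Psi_{xy}(R^{-1}) &= Ryx \\
\Psi_{xy}(\omega_1 \sqcap \omega_2) &= \Psi_{xy}(\omega_1) \wedge \Psi_{xy}(\omega_2) \\
\Psi_{xy}(\omega_1 \sqcup \omega_2) &= \Psi_{xy}(\omega_1) \vee \Psi_{xy}(\omega_2)
\end{aligned}$$

and set $\Psi_x(\bowtie n\ \omega\ C) = \exists^{\bowtie n} y.\Psi_{xy}(\omega) \wedge \Psi_y(C)$.

This translation is obviously polynomial and satisfies, for every interpretation $\mathcal{I}$ and concept $C$,

$$C^\mathcal{I} = \{a \in \Delta^\mathcal{I} \mid \mathcal{I} \models \Psi_x(C)(a)\}.$$

Hence, a concept $C$ is satisfiable iff $\exists^{\geq 1} x.\Psi_x(C)$ is satisfiable. CBoxes can be reduced to $C^2$ as shown in Figure 5.1. This yields the desired reductions. ■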
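The role part of this translation is a straightforward recursion. The following sketch renders $\Psi_{xy}$ as a formula string; the tuple encoding of role expressions (`"R"` for a role name, `("inv", "R")`, `("and", w1, w2)`, `("or", w1, w2)`), the ASCII connectives, and the function name `psi_xy` are assumptions made for illustration. The case for role negation is also our assumption: it is the evident clause for the complement of a role, but it is not among the clauses listed in the proof as it appears here.

```python
def psi_xy(role, x="x", y="y"):
    """Translate a Boolean role expression into a two-variable formula string."""
    if isinstance(role, str):                 # role name R       ->  R(x,y)
        return f"{role}({x},{y})"
    op = role[0]
    if op == "inv":                           # inverse role R^-1  ->  R(y,x)
        return f"{role[1]}({y},{x})"
    if op == "not":                           # assumed clause for role negation
        return f"~({psi_xy(role[1], x, y)})"
    if op in ("and", "or"):
        symbol = "&" if op == "and" else "|"
        return f"({psi_xy(role[1], x, y)} {symbol} {psi_xy(role[2], x, y)})"
    raise ValueError(f"unknown role constructor: {op}")
```

Only the two variables passed in are ever used, which is what keeps the image of the translation inside the two-variable fragment.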



Since ALCQI is a subset of ALCQIb, which, in turn, is a subset of ALCQIB, the following is a simple corollary of Theorem 5.19 and Lemma 5.29:

Corollary 5.30

Satisfiability of ALCQIb- or ALCQIB-CBoxes is NExpTime-hard. The problems are NExpTime-complete if unary coding of numbers in the input is assumed.

Proof. The lower bound is immediate from Theorem 5.19 because the set of ALCQI-concepts is strictly included in the set of ALCQIb- and ALCQIB-concepts. In the case of unary coding of numbers in the input, the upper bound follows from Lemma 5.29 and Fact 5.9. ■

Similarly, the results for reasoning with nominals also transfer from Corollary 5.21.

Corollary 5.31

Satisfiability of ALCQIbO- and ALCQIBO-concepts is NExpTime-hard. The problems are NExpTime-complete if unary coding of numbers in the input is assumed.

So, in the presence of cardinality restrictions or nominals, reasoning with ALCQIB is not harder than reasoning with ALCQIb. Without cardinality restrictions or nominals, though, reasoning with ALCQIb is less complex (ExpTime-complete, Theorem 4.42) than reasoning for ALCQIB. The reason for this is that ALCQIB can easily mimic cardinality restrictions (and hence nominals) using a fresh role:

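Lemma 5.32

CBox satisfiability for ALCQ and ALCQI is polynomially reducible to concept satisfiability of ALCQB and ALCQIB respectively.

Proof. Let $C$ be an ALCQ(I)B-CBox and $R$ a role that does not occur in $C$. We transform $C$ into an ALCQ(I)B-concept $C_C$ by setting

$$C_C = (\leq 0\ \neg R\ \top) \sqcap \bigsqcap_{i=1}^{k} (\bowtie_i n_i\ R\ C_i).$$

Claim 5.33 $C_C$ is satisfiable iff $C$ is satisfiable.

Let $\mathcal{I}$ be a model for $C$. We define a model $\mathcal{I}'$ of $C_C$ by setting $R^{\mathcal{I}'} := \Delta^\mathcal{I} \times \Delta^\mathcal{I}$ and preserving the interpretation of all other names. Since $R$ does not occur in $C$, $C_i^\mathcal{I} = C_i^{\mathcal{I}'}$ holds for every $i$. Since $R$ is interpreted by the universal relation, $(\leq 0\ \neg R\ \top)^{\mathcal{I}'} = \Delta^{\mathcal{I}'}$ holds. Also, again since $R^{\mathcal{I}'}$ is the universal relation, for every $x \in \Delta^{\mathcal{I}'}$, $\{y \mid (x, y) \in R^{\mathcal{I}'} \text{ and } y \in C_i^{\mathcal{I}'}\} = C_i^{\mathcal{I}'}$. Thus, if $\mathcal{I} \models (\bowtie_i n_i\ C_i)$, then $(\bowtie_i n_i\ R\ C_i)^{\mathcal{I}'} = \Delta^{\mathcal{I}'}$. Hence, from $\mathcal{I} \models C$ it follows that $C_C^{\mathcal{I}'} = \Delta^{\mathcal{I}'}$, which proves its satisfiability.

For the converse direction, if $C_C$ is satisfiable with $x \in C_C^\mathcal{I}$ for an interpretation $\mathcal{I}$, then, since $x \in (\leq 0\ \neg R\ \top)^\mathcal{I}$, $\{y \mid (x, y) \in R^\mathcal{I}\} = \Delta^\mathcal{I}$ must hold and hence $\{y \mid (x, y) \in R^\mathcal{I} \text{ and } y \in C_i^\mathcal{I}\} = C_i^\mathcal{I}$. It immediately follows that $\mathcal{I} \models C$.

Obviously, the size of $C_C$ is linear in the size of $C$, which proves this lemma. ■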
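The construction of $C_C$ is mechanical, which a short sketch may make more tangible. The encoding is an assumption of ours: a CBox is a list of triples `(op, n, concept)` with `op` one of `"atleast"`, `"atmost"`, `"exactly"`; concepts are opaque values; the fresh role defaults to the hypothetical name `"U"`; and `("not", role)` denotes the negated role.

```python
def cbox_to_concept(cbox, fresh_role="U"):
    """Turn cardinality restrictions into one concept over a universal role."""
    top = ("top",)
    # (<= 0 notR top): no element may fail to be an R-successor
    result = ("atmost", 0, ("not", fresh_role), top)
    for op, n, concept in cbox:
        # each restriction (op n C) becomes the number restriction (op n R C)
        result = ("and", result, (op, n, fresh_role, concept))
    return result
```

Each cardinality restriction contributes one conjunct of constant overhead, matching the linear size bound stated at the end of the proof.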

Corollary 5.34

Concept satisfiability for ALCQB and ALCQIB is NExpTime-hard. The problems are NExpTime-complete if unary coding of numbers in the input is assumed.

Proof. Concept satisfiability for ALCQIB is NExpTime-hard by Lemma 5.32 and Theorem 5.19; it can be decided in NExpTime by Lemma 5.29 and Fact 5.9.

For ALCQB the situation is slightly more complicated because Lemma 5.32 yields NExpTime-hardness only for binary coding of numbers. Yet, Lutz and Sattler (2000) show that concept satisfiability even for ALCB, i.e., ALC extended with Boolean role expressions, is NExpTime-hard, which yields the lower bound also for the case of unary coding of numbers. The matching upper bound (in the case of unary coding) again follows from Lemma 5.29 and Fact 5.9. ■

Of course, (Lutz & Sattler, 2000) yields the lower bound also for ALCQIB. Since the connection between reasoning with cardinality restrictions and full Boolean role expressions established in Lemma 5.32 is interesting in itself and yields, as a simple corollary, the result for ALCQIB, we include this alternative proof of this fact in this thesis.



Chapter 6 

Transitive Roles and Role Hierarchies 



This chapter explores reasoning with Description Logics that allow for transitive roles. 
Transitive roles play an important role in knowledge representation because, as argued 
by Sattler (2000), transitive roles in combination with role hierarchies are adequate to 
represent aggregated objects, which occur in many application areas of knowledge repre- 
sentation, like configuration (Wache & Kamp, 1996; Sattler, 1996b; McGuinness & Wright, 
1998), ontologies (Rector & Horrocks, 1997), or data modelling (Calvanese, Lenzerini, & 
Nardi, 1998; Calvanese, De Giacomo, Lenzerini, Nardi, & Rosati, 1998; Calvanese, De Gi- 
acomo, & Rosati, 1999; Franconi, Baader, Sattler, & Vassiliadis, 2000). 

Baader (1991) and Schild (1991) were the first to study the transitive closure of roles in DLs that extend ALC, and they both developed DLs that are notational variants of PDL (Fischer & Ladner, 1979). Due to the expressive power of the transitive closure, these logics allow for the internalisation of terminological axioms (Baader, 1991; Schild, 1991; Baader et al., 1993) and hence reasoning for these logics is at least ExpTime-hard. Sattler (1996a) studies a number of DLs with transitive constructs and identifies the DL S,¹ i.e., ALC extended with transitive roles, as an extension of ALC that still permits a PSpace reasoning procedure.

Horrocks and Sattler (1998) study SI, the extension of S with inverse roles, and develop a tableau based reasoning procedure. While they conjecture that concept satisfiability and subsumption can be decided for SI in PSpace, their algorithm only yields an NExpTime upper bound. We verify their conjecture by refining their tableau algorithm so that it decides concept satisfiability (and hence subsumption) in PSpace. A comparable approach is used by Spaan (1993b) to show that satisfiability of the modal logic $K4_t$ (corresponding to SI with only a single, transitive role) can be decided in PSpace.

Subsequently SI is extended with role hierarchies (Horrocks & Gough, 1997) and qualifying number restrictions, which yields the DL SHIQ. The expressive power of SHIQ is particularly well suited to capture many properties of aggregated objects (Sattler, 2000) and has applications in the area of conceptual data models (Calvanese, Lenzerini, & Nardi, 1994; Franconi & Ng, 2000) and query optimization (Horrocks, Sattler, Tessaris, & Tobies, 2000). Furthermore, there exists the OIL approach (Fensel, Horrocks, van Harmelen, Decker, Erdmann, & Klein, 2000) to add SHIQ-based inference capabilities to the semantic web (Berners-Lee, 1999). These applications have only become feasible due to the availability of the highly optimized reasoner iFaCT (Horrocks, 1999) for SHIQ.

¹ Previously, this logic has been called $ALC_{R^+}$. Here, we use S instead because of a vague correspondence of $ALC_{R^+}$ with the modal logic S4.

We determine the worst-case complexity of reasoning with SHIQ as ExpTime-complete, even if binary coding of numbers in the number restrictions is used. This result relies on a reduction of SHIQ to ALCQIb with TBoxes, a problem we already know how to solve in ExpTime (Theorem 4.38). Using the same reduction we show that reasoning for SHIQO, i.e., the extension of SHIQ with nominals, is NExpTime-complete (in the case of unary coding of numbers).

As the upper ExpTime-bound for SHIQ relies on a highly inefficient automata construction, Section 6.3 extends the tableau algorithm for SHIF (Horrocks & Sattler, 1999) to deal with full qualifying number restrictions. While this algorithm does not meet the worst-case complexity of the problem (a naive implementation of the tableau algorithm would run in 2-NExpTime), it is amenable to optimizations and forms the basis of the highly optimised DL system iFaCT (Horrocks, 1999). See Section 3.1 for a discussion of the different reasoning paradigms and issues of practicability of algorithms.

6.1 Transitive and Inverse Roles: SI

In this section we study the complexity of reasoning with the DL SI, an extension of the DL ALC with transitive roles and inverse roles:

Definition 6.1 (Syntax and Semantics of SI)

Let $NC$ be a set of atomic concept names, $NR$ a set of atomic role names, and $NR^+ \subseteq NR$ a set of transitive role names. With $\overline{NR} := NR \cup \{R^{-1} \mid R \in NR\}$ we denote the set of SI-roles. The set of SI-concepts is built inductively from $NC$ and $\overline{NR}$ using the following grammar, where $A \in NC$ and $R \in \overline{NR}$:

$$C ::= A \mid \neg C \mid C_1 \sqcap C_2 \mid C_1 \sqcup C_2 \mid \forall R.C \mid \exists R.C.$$

The semantics of SI-concepts is defined similarly to the semantics of ALC-concepts w.r.t. an interpretation $\mathcal{I}$, where, for an inverse role $R^{-1} \in \overline{NR}$, we set $(R^{-1})^\mathcal{I} = \{(y, x) \mid (x, y) \in R^\mathcal{I}\}$. Moreover, we only consider those interpretations that interpret transitive roles $R \in NR^+$ by transitive relations. An SI-concept $C$ is satisfiable iff there exists an interpretation $\mathcal{I}$ such that, for every $R \in NR^+$, $R^\mathcal{I}$ is transitive, and $C^\mathcal{I} \neq \emptyset$. Subsumption is defined as usual, again with the restriction to interpretations that interpret transitive roles with transitive relations.

With S we denote the fragment of SI that does not contain any inverse roles. ◇

In order to make the following considerations easier, we introduce two functions on roles:

1. The inverse relation on roles is symmetric, and to avoid considering roles such as $(R^{-1})^{-1}$ we define a function Inv which returns the inverse of a role. More precisely, $Inv(R) = R^{-1}$ if $R$ is a role name, and $Inv(R) = S$ if $R = S^{-1}$.

2. Obviously, the interpretation $R^\mathcal{I}$ of a role $R$ is transitive if and only if the interpretation of $Inv(R)$ is transitive. However, this may be required by either $R$ or $Inv(R)$ being in $NR^+$. We therefore define a function Trans, which is true iff $R$ must be interpreted with a transitive relation, regardless of whether it is a role name or the inverse of a role name. More precisely, $Trans(R) = true$ iff $R \in NR^+$ or $Inv(R) \in NR^+$.
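Both functions are easily written down once a representation for roles is fixed. The sketch below assumes, purely for illustration, that a role is a string and that the inverse of a role name `R` is written `"R-"`; the names `inv` and `trans` and the parameter `transitive_names` (playing the part of $NR^+$) are ours.

```python
def inv(role):
    """Return the inverse of a role without ever stacking two inverse markers."""
    return role[:-1] if role.endswith("-") else role + "-"


def trans(role, transitive_names):
    """True iff the role must be interpreted by a transitive relation."""
    return role in transitive_names or inv(role) in transitive_names
```

With $NR^+ = \{P\}$, for example, `trans("P-", {"P"})` holds although only the role name `P` itself was declared transitive.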

6.1.1 The SI-algorithm

We will now describe a tableau algorithm that decides satisfiability of SI-concepts in PSpace, thus proving PSpace-completeness of SI-satisfiability. Like other tableau algorithms, the SI-algorithm tries to prove the satisfiability of a concept $C$ by constructing a model for $C$. The model is represented by a so-called completion tree, a tree some of whose nodes correspond to individuals in the model, each node being labelled with two sets of SI-concepts. When testing the satisfiability of an SI-concept $C$, these sets are restricted to subsets of $sub(C)$, where $sub(C)$ is the set of subconcepts of $C$, which is defined in the obvious way. Before we formally present the algorithm, we first discuss some problems that need to be overcome when trying to develop an SI-algorithm that can be implemented to run in PSpace.

Dealing with transitive roles in tableau algorithms requires extra considerations because transitivity of a role is, generally speaking, a global constraint whereas the expansion rules and clash conditions of the tableau algorithms that we have studied so far are of a more local nature. They only take into account a single node of the constraint system or at most a node and its direct neighbours. Also, many of our previous considerations relied on the fact that satisfiable concepts have tree models, which, in the presence of transitive roles, is no longer the case. To circumvent these problems, we use the solution that has already been used, e.g., by Halpern and Moses (1992) to deal with the modal logic S4, which possesses a reflexive and transitive accessibility relation. Instead of directly dealing with models and transitive relations, we use abstractions of models, so-called tableaux, that disregard transitivity of roles and have the form of a tree. This is done in a way that allows us to recover a model of the input concept by transitively closing those role relations that are explicitly asserted in the tableau. To prove satisfiability of the input concept, the SI-algorithm now tries to build a tableau instead of trying to construct a model. Apart from this difference, the SI-algorithm is very similar to the tableau algorithms we have encountered so far: starting from an initial constraint system it employs completion rules until the constraint system is complete, in which case the existence of a tableau is evident, or until an obvious contradiction indicates an unsuccessful run of the (non-deterministic) algorithm.

While it would be possible to maintain the use of ABoxes to capture the constraint system that we will encounter during our discussion of DLs with transitive roles, it is much more convenient to emphasise the view of constraint systems as node- and edge-labelled trees, so this view will prevail in the remainder of this chapter.



6.1.2 Blocking 

Sattler (1996a) shows that concept satisfiability for S can be determined in polynomial space using an adaptation of the techniques employed by Halpern and Moses (1992) to decide satisfiability for the modal logic S4. To understand why these techniques cannot be extended easily to deal with inverse roles, as we have done in Chapter 4 when generalizing from ALCQ to ALCQIb, we have to discuss the role of blocking.

The key difference between the algorithms from the previous chapters and the SI-algorithm lies in the way universal restrictions are propagated through the constraint system: whenever $\forall R.C$ with $Trans(R)$ appears in the label of a node $x$ and $x$ has an $R$-neighbour $y$, then not only $C$ is asserted for $y$, but also $\forall R.C$. This makes sure that $C$ is successively asserted for every node reachable from $x$ via a chain of $R$-edges. These are exactly the nodes that are reachable from $x$ with a single $R$-step once $R$ has been transitively closed; exactly these nodes must satisfy $C$ in order for $\forall R.C$ to hold at $x$.

Previously the termination of the tableau algorithms relied on the fact that the nesting of universal and existential restrictions strictly decreases along a path in the tableau. When dealing with transitive roles in the described manner, this is no longer guaranteed. For example, consider a node $x$ labelled $\{C, \exists R.C, \forall R.(\exists R.C)\}$, where $R$ is a transitive role. The described approach would cause the new node $y$ that is created for the $\exists R.C$ constraint to receive a label identical to the label of $x$. Thus, the expansion process could be repeated indefinitely.

The way we deal with this problem is by blocking: halting the expansion process when a cycle is detected (Baader, 1991; Buchheit et al., 1993; Halpern & Moses, 1992; Sattler, 1996a; Baader et al., 1996; Horrocks & Sattler, 1999). For logics without inverse roles, the general procedure is to check the constraints asserted for each new node $y$, and if they are a subset of the constraints for an existing node $x$, then no further expansion of $y$ is performed: $x$ is said to block $y$. The resulting constraint system corresponds to a cyclic model in which $y$ is identified with $x$.² The validity of the cyclic model is an easy consequence of the fact that each $\exists R.D$ constraint for $y$ must also be satisfied by $x$ because the constraints for $x$ are a superset of the constraints for $y$. Termination is now guaranteed by the fact that all constraints for individuals in the constraint system are ultimately derived from the decomposition of the input concept $C$, so every set of constraints for an individual must be a subset of the subconcepts of $C$, and a blocking situation must therefore occur within a finite number of expansion steps.

² For logics with a transitive closure operator it is necessary to check the validity of the cyclic model created by blocking (Baader, 1991), but for logics that only support transitive roles the cyclic model is always valid (Sattler, 1996a).
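The interplay of propagation along transitive roles and subset blocking can be traced on the example label $\{C, \exists R.C, \forall R.(\exists R.C)\}$. The sketch below is a deliberately simplified illustration and not the SI-algorithm of this chapter: it handles no inverse roles, no disjunction, and no clash detection, and it follows a single branch only. Concepts are encoded as tuples `("atom", A)`, `("some", R, C)`, `("all", R, C)`; this encoding and the names `successor_label` and `expand_branch` are assumptions of ours.

```python
def successor_label(label, some, transitive):
    """Label of the fresh successor created for the existential constraint `some`."""
    _, role, filler = some
    new = {filler}
    for c in label:
        if c[0] == "all" and c[1] == role:
            new.add(c[2])          # ordinary propagation of the value restriction
            if role in transitive:
                new.add(c)         # transitive role: pass the restriction itself on
    return frozenset(new)


def expand_branch(root_label, transitive, max_depth=100):
    """Expand one branch until nothing is left to do or subset blocking applies.

    Returns the list of labels on the branch and the index of the blocking
    ancestor (None if the branch ended without a block).
    """
    path = [frozenset(root_label)]
    while len(path) < max_depth:
        somes = [c for c in path[-1] if c[0] == "some"]
        if not somes:
            return path, None
        child = successor_label(path[-1], min(somes), transitive)
        for index, ancestor in enumerate(path):
            if child <= ancestor:  # subset blocking: child adds nothing new
                return path, index
        path.append(child)
    raise RuntimeError("depth bound reached without blocking")
```

For the example label with $R$ transitive, `successor_label` returns the label of $x$ again, so without the blocking test the loop would never stop; with it, the very first successor is blocked by the root.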






Dynamic Blocking 

Blocking is more problematic when inverse roles are added to the logic, and a key feature of 
the algorithms presented in (Horrocks & Sattler, 1999) was the introduction of a dynamic 
blocking strategy. It uses label equality instead of the subset condition, and it allows blocks 
to be established, broken, and re-established. 

Label inclusion as a blocking criterion is no longer correct in the presence of inverse 
roles because roles are now bi-directional, and hence universal restrictions at the blocking 
node can conflict with the constraints for the predecessor of the blocked node. 

Taking the above example of a node labelled $\{C, \exists R.C, \forall R.(\exists R.C)\}$, if the successor
of this node were blocked by a node whose label additionally included $\forall R^{-1}.\neg C$, then the
cyclic model would clearly be invalid. This is shown in Figure 6.1, where x blocks its
R-successor y (if subset-blocking is assumed) and hence in the induced model (shown on
the right), there exists an R-cycle from x to x. Hence, $C$ and $\forall R^{-1}.\neg C$, which have both
been asserted for x, now stand in a conflict.

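Figure 6.1 An invalid cyclic model

[Figure: on the left, the completion tree, in which x, labelled
$\{C, \exists R.C, \forall R.(\exists R.C), \forall R^{-1}.\neg C\}$, blocks its R-successor y, labelled
$\{C, \exists R.C, \forall R.(\exists R.C)\}$; on the right, the induced model, in which x carries the
same label and has an R-edge to itself.]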



In (Horrocks & Sattler, 1999), this problem was overcome by allowing a node x to 
be blocked by one of its ancestors if and only if they were labelled with the same set of 
concepts. 

Another difficulty introduced by inverse roles is the fact that it is no longer possible 
to establish a block on a once-and-for-all basis when a new node is added to the tree. 
This is because further expansion in other parts of the tree could lead to the labels of the 
blocking and/or blocked node being extended and the block being invalidated. Consider 
the example sketched in Figure 6.2, which shows parts of a tableau that was generated for 
the concept 

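$A \sqcap \exists S.(\exists R.\top \sqcap \exists P.\top \sqcap \forall R.C \sqcap \forall P.(\exists R.\top) \sqcap \forall P.(\forall R.C) \sqcap \forall P.(\exists P.\top))$,

where C represents the concept

$\forall R^{-1}.(\forall P^{-1}.(\forall S^{-1}.\neg A))$.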

This concept is clearly not satisfiable: w has to be an instance of C, which implies that x
is an instance of $\neg A$. This is inconsistent with x being an instance of A.

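Since P is a transitive role, all universal value restrictions over P are propagated from
y to z, hence y and z are labelled with the same constraints and hence z is blocked by y.

Figure 6.2 A tableau where dynamic blocking is crucial

[Figure: a completion tree. The root x, with $\mathbf{L}(x) = \{A, \ldots\}$, has an S-successor y with
$\mathbf{L}(y) = \{\exists R.\top, \exists P.\top, \forall R.C, \forall P.(\exists R.\top), \forall P.(\exists P.\top), \forall P.(\forall R.C)\}$.
The node y has a successor v with $\mathbf{L}(v) = \{C\}$ and a P-successor z, where $P \in \mathsf{NR}_+$,
with $\mathbf{L}(z) = \mathbf{L}(y)$; z is blocked by y and has a successor w.]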

If the blocking of z were not subsequently broken when $\forall P^{-1}.(\forall S^{-1}.\neg A)$ is added to $\mathbf{L}(y)$
from $C \in \mathbf{L}(v)$, then $\neg A$ would never be added to $\mathbf{L}(x)$ and the unsatisfiability would not
be detected.

As well as allowing blocks to be broken, it is also necessary to continue with some
expansion of blocked nodes, because $\forall R.C$ concepts in their labels could affect other parts
of the tree. Again, let us consider the example in Figure 6.2. After the blocking of z is
broken and $\forall P^{-1}.(\forall S^{-1}.\neg A)$ is added to $\mathbf{L}(z)$ from $C \in \mathbf{L}(w)$, z is again blocked by y.
However, the universal value restriction $\forall P^{-1}.(\forall S^{-1}.\neg A) \in \mathbf{L}(z)$ has to be expanded in
order to detect the unsatisfiability.

These problems are overcome by using dynamic blocking: using label equality as the blocking
criterion, allowing blocks to be dynamically established and broken as the expansion
progresses, and continuing to expand $\forall R.C$ concepts in the labels of blocked nodes.

Refined blocking 

As mentioned before, in (Horrocks & Sattler, 1999), blocking of nodes is based on label 
equality. This leads to major problems when trying to establish a polynomial bound on 
the length of paths in the completion tree. If a node can only be blocked by an ancestor 
when the labels coincide, then there could potentially be exponentially many ancestors in a 
path before blocking actually occurs. Due to the non-deterministic nature of the expansion 
rules, these subsets might actually be generated; the algorithm would then need to store 
the node labels of a path of exponential length, thus consuming exponential space. 

This problem is already present when one tries to implement a tableau algorithm for
the logic $\mathcal{ALC}_{R^+}$ (Sattler, 1996a), where the non-deterministic nature of the expansion rules
for disjunction might lead to the generation of a chain of exponential size before blocking
occurs. Consider, for example, the concept



$C = \exists R.D \sqcap \forall R.(\exists R.D)$

$D = (A_1 \sqcup B_1) \sqcap (A_2 \sqcup B_2) \sqcap \cdots \sqcap (A_n \sqcup B_n)$

where R is a transitive role. The concept C causes the generation of a chain of R-successors
for all of which D is asserted. There are $2^n$ possible ways of expanding D because for every
disjunctive concept $A_i \sqcup B_i$ the $\rightarrow_\sqcup$-rule can choose to add $A_i$ or $B_i$. The completion tree for
D is only complete once one node of this path is blocked and all unblocked nodes (including
the blocking node) are fully expanded. For $\mathcal{ALC}_{R^+}$, a polynomial bound on the length of
paths is obtained by applying a simple strategy: a new successor is only generated when
no other rule can be applied, and propositional expansion of concepts only takes place if
universal restrictions have been exhaustively dealt with. Once a node is blocked, it is
not necessary to perform its propositional expansion because it has already been ensured
at the blocking node that such an expansion is possible without causing a clash.

However, in the presence of inverse roles, this strategy is no longer possible. Indeed, the
expansion rules for $\mathcal{SI}$ as they have been presented in (Horrocks & Sattler, 1998) based on
set equality might lead to a tableau with paths of exponential length for C, even though
C does not contain any inverse roles. This is due to the fact that blocking is established
on the basis of label equality. Since the labels of the blocked and the blocking node must be
equal, and since the label of the blocking node must be fully expanded, the label of the
blocked node must be fully expanded as well. Since there are $2^n$ possibilities for such an
expansion, it might indeed take a path of $2^n + 1$ nodes before such a situation necessarily
occurs and the completion tree is complete.
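To make the count of $2^n$ concrete, the following Python sketch enumerates the possible propositional expansions of the n disjunctions in D. It is an illustration only: the string names `A1`, `B1`, ... and the function `disjunct_expansions` are hypothetical stand-ins for the concepts of the example.

```python
from itertools import product


def disjunct_expansions(n):
    """Enumerate every way of choosing A_i or B_i from each of the n
    disjunctions (A_1 or B_1), ..., (A_n or B_n)."""
    choices = [(f"A{i}", f"B{i}") for i in range(1, n + 1)]
    return [frozenset(pick) for pick in product(*choices)]


expansions = disjunct_expansions(3)
```

Under equality blocking, each of these pairwise different sets may end up in the label of a different node on the path, which is why a label is only guaranteed to repeat after exponentially many nodes.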

In order to obtain a tableau algorithm that circumvents this problem and guarantees 
blocking after a polynomial number of steps, we will keep the information that is relevant 
for blocking separated from the "irrelevant" information (due to propositional expansion) 
in a way which allows for a simple and comprehensible tableau algorithm. In the following, 
we will explain this "separation" idea in more detail. 

Figure 6.3 Refined blocking

[Figure: a node y, reached from its predecessor y' via an S-edge, is blocked by an ancestor x.
The blocking condition shown is $\mathbf{B}(y) \subseteq \mathbf{L}(x)$ and $\mathbf{L}(y)/\mathsf{Inv}(S) = \mathbf{L}(x)/\mathsf{Inv}(S)$; a backward
arrow leads from y' to x.]

Figure 6.3 shows a blocking situation. Assume node y to be blocked by node x. When
generating a model from this tree, the blocked node y will be omitted and y' will get x as an
S-successor, which is indicated by the backward arrow. On the one hand, this construction
yields a new S-successor x of y', a situation which is taken care of by the subset blocking
used in the normal $\mathcal{ALC}_{R^+}$ tableau algorithms. On the other hand, x receives a new
$S^{-1}$-successor y'. Now blocking has to make sure that, if x must satisfy a concept of the form
$\forall S^{-1}.D$, then D (and $\forall S^{-1}.D$ if S is a transitive role) is satisfied by y'.

This was dealt with by equality blocking in (Horrocks & Sattler, 1999). In the following 
algorithm it will be dealt with using two labels per node and a modified blocking condition 
that takes these two labels into account. In addition to the label L, each node now 
has a second label B, where the latter is always a subset of the former. The label L 
contains complete information, whereas B contains only information relevant to blocking. 
Propositional consequences of concepts in L and concepts being propagated "upwards" in 
the tree are stored in L only, as they are not important for blocking as long as they are not 
universal restrictions that state requirements on the predecessor in the completion tree. 
The modified blocking condition now looks as follows. For a node y to be blocked by a 
node x we require that 

• the label B(y) of the blocked node y is contained in the label L(x) of the blocking 
node x. Expansions of disjunctions are only stored in L and thus cannot prevent a 
node from being blocked. 

• if y is reachable from its predecessor in the completion tree via the role S, then the
universal restrictions along $\mathsf{Inv}(S)$ asserted for y are the same as those asserted for
x. This takes care of the fact that the predecessor y' of the blocked node y becomes
a new $\mathsf{Inv}(S)$-successor of the blocking node x.

Summing up, we build a completion tree in a way that, for all nodes x, 

• we have $\mathbf{B}(x) \subseteq \mathbf{L}(x)$,

• B(x) contains only concepts which move down the tree, 

• L(x) contains, additionally, all concepts which move up the tree, and 

• expansion of disjunctions and conjunctions only affects L(x). 

6.1.3 A Tableau Algorithm for SI 

We now present a tableau algorithm derived from the one presented in (Horrocks & Sattler, 
1999). We shape the rules in a way that allows for the separation of the concepts which are 
relevant for the two parts of the blocking condition. For ease of construction, we assume 
all concepts to be in negation normal form (NNF), that is, negation occurs only in front of 
concept names. Any $\mathcal{SI}$-concept can easily be transformed into an equivalent one in NNF
in the same way as this can be done for $\mathcal{ALC}$-concepts (Definition 3.1).
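The transformation into NNF can be sketched as follows. This is a minimal illustration under an assumed encoding that is not taken from the thesis: concept names are strings, and complex concepts are tuples of the form `("not", C)`, `("and", C1, C2)`, `("or", C1, C2)`, `("some", R, C)` and `("all", R, C)`; the function name `nnf` is hypothetical.

```python
def nnf(concept):
    """Push negation inwards until it occurs only in front of concept names."""
    if isinstance(concept, str):            # concept name
        return concept
    op = concept[0]
    if op == "not":
        inner = concept[1]
        if isinstance(inner, str):          # negated concept name: already in NNF
            return concept
        inner_op = inner[0]
        if inner_op == "not":               # double negation
            return nnf(inner[1])
        if inner_op == "and":               # De Morgan
            return ("or", nnf(("not", inner[1])), nnf(("not", inner[2])))
        if inner_op == "or":                # De Morgan
            return ("and", nnf(("not", inner[1])), nnf(("not", inner[2])))
        if inner_op == "some":              # duality of the restrictions
            return ("all", inner[1], nnf(("not", inner[2])))
        if inner_op == "all":
            return ("some", inner[1], nnf(("not", inner[2])))
        raise ValueError(f"unknown constructor: {inner_op!r}")
    if op in ("and", "or"):
        return (op, nnf(concept[1]), nnf(concept[2]))
    if op in ("some", "all"):
        return (op, concept[1], nnf(concept[2]))
    raise ValueError(f"unknown constructor: {op!r}")


example = nnf(("not", ("and", "A", ("some", "R", "B"))))
```

Roles, including inverse roles, are left untouched by the transformation; only the Boolean structure and the quantifiers are affected.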

The soundness and completeness of the algorithm will be proved by showing that it 
creates a tableau for C. In contrast to the approach we have taken in the previous chapters, 
where a constraint system stood in direct correspondence to a model, here we introduce 
tableaux as intermediate structures that encapsulate the transition from the syntactic object
of a completion tree to the semantical object of a model and take care of the transitive
roles along the way. This makes it possible for the algorithm to operate on trees even though
$\mathcal{SI}$ does not have a genuine tree model property due to its transitive roles.

Definition 6.2 (A Tableau for $\mathcal{SI}$)

If C is an $\mathcal{SI}$-concept in NNF and $\overline{\mathsf{NR}}_C$ is the set of roles occurring in C together with
their inverses, a tableau T for C is a triple $(\mathbf{S}, \mathcal{L}, \mathcal{E})$ such that $\mathbf{S}$ is a non-empty set,
$\mathcal{L} : \mathbf{S} \to 2^{\mathit{sub}(C)}$ maps each element of $\mathbf{S}$ to a subset of $\mathit{sub}(C)$, and
$\mathcal{E} : \overline{\mathsf{NR}}_C \to 2^{\mathbf{S} \times \mathbf{S}}$ maps each role in $\overline{\mathsf{NR}}_C$ to a set of pairs of individuals.
In addition, the following conditions must be satisfied:

(T1) There is an $s \in \mathbf{S}$ with $C \in \mathcal{L}(s)$, and

for all $s, t \in \mathbf{S}$, $A, C_1, C_2, D \in \mathit{sub}(C)$, and $R \in \overline{\mathsf{NR}}_C$,

(T2) if $A \in \mathcal{L}(s)$, then $\neg A \notin \mathcal{L}(s)$, for $A \in \mathsf{NC}$,

(T3) if $C_1 \sqcap C_2 \in \mathcal{L}(s)$, then $C_1 \in \mathcal{L}(s)$ and $C_2 \in \mathcal{L}(s)$,

(T4) if $C_1 \sqcup C_2 \in \mathcal{L}(s)$, then $C_1 \in \mathcal{L}(s)$ or $C_2 \in \mathcal{L}(s)$,

(T5) if $\exists R.D \in \mathcal{L}(s)$, then there is some $t \in \mathbf{S}$ such that $(s,t) \in \mathcal{E}(R)$ and $D \in \mathcal{L}(t)$,

(T6) if $\forall R.D \in \mathcal{L}(s)$ and $(s,t) \in \mathcal{E}(R)$, then $D \in \mathcal{L}(t)$,

(T7) if $\forall R.D \in \mathcal{L}(s)$, $(s,t) \in \mathcal{E}(R)$ and $\mathsf{Trans}(R)$, then $\forall R.D \in \mathcal{L}(t)$, and

(T8) $(s,t) \in \mathcal{E}(R)$ iff $(t,s) \in \mathcal{E}(\mathsf{Inv}(R))$.
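For a finite candidate structure, conditions (T1) to (T8) can be checked mechanically. The Python sketch below illustrates this under assumptions that are not part of the thesis: concepts use a tuple encoding (`("and", C1, C2)`, `("or", C1, C2)`, `("some", R, C)`, `("all", R, C)`, `("not", A)`, concept names as strings), the inverse of a role name `R` is written `R-`, and the set `transitive` is expected to contain a role together with its inverse. The names `inv` and `is_tableau` are hypothetical.

```python
def inv(role):
    """Inverse of a role; 'R-' stands for the inverse of the role name 'R'."""
    return role[:-1] if role.endswith("-") else role + "-"


def is_tableau(concept, elements, labels, edges, transitive):
    """Check (T1)-(T8) for a finite candidate tableau.

    labels:     dict mapping each element to a set of concepts in NNF
    edges:      dict mapping each role (and each inverse role) to a set of pairs
    transitive: set of roles R for which Trans(R) holds
    """
    if not any(concept in labels[s] for s in elements):                 # (T1)
        return False
    for s in elements:
        for c in labels[s]:
            if isinstance(c, str):
                if ("not", c) in labels[s]:                              # (T2)
                    return False
                continue
            op = c[0]
            if op == "and" and not (c[1] in labels[s] and c[2] in labels[s]):
                return False                                             # (T3)
            if op == "or" and not (c[1] in labels[s] or c[2] in labels[s]):
                return False                                             # (T4)
            if op == "some":                                             # (T5)
                pairs = edges.get(c[1], set())
                if not any(a == s and c[2] in labels[t] for (a, t) in pairs):
                    return False
            if op == "all":
                for (a, t) in edges.get(c[1], set()):
                    if a != s:
                        continue
                    if c[2] not in labels[t]:                            # (T6)
                        return False
                    if c[1] in transitive and c not in labels[t]:        # (T7)
                        return False
    for role, pairs in edges.items():                                    # (T8)
        if {(t, s) for (s, t) in pairs} != edges.get(inv(role), set()):
            return False
    return True
```

Such a checker only verifies a given structure; it does not search for one, which is the task of the tableau algorithm.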

A tableau T for a concept C is a "syntactic witness" for the satisfiability of C:

Lemma 6.3

An $\mathcal{SI}$-concept C is satisfiable iff there exists a tableau for C.

Proof. For the if-direction, if $T = (\mathbf{S}, \mathcal{L}, \mathcal{E})$ is a tableau for C with $C \in \mathcal{L}(s_0)$, a model
$\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ of C can be defined as follows:

    $\Delta^{\mathcal{I}} = \mathbf{S}$,
    $A^{\mathcal{I}} = \{ s \mid A \in \mathcal{L}(s) \}$ for all concept names A in $\mathit{sub}(C)$,
    $R^{\mathcal{I}} = \mathcal{E}(R)^+$ if $\mathsf{Trans}(R)$, and $R^{\mathcal{I}} = \mathcal{E}(R)$ otherwise,

where $\mathcal{E}(R)^+$ denotes the transitive closure of $\mathcal{E}(R)$. Transitive roles are interpreted by
transitive relations by definition. By induction on the structure of concepts, we show that,
if $D \in \mathcal{L}(s)$, then $s \in D^{\mathcal{I}}$. This implies $C^{\mathcal{I}} \neq \emptyset$ because $s_0 \in C^{\mathcal{I}}$. Let $D \in \mathcal{L}(s)$:

1. If $D = A \in \mathsf{NC}$ is a concept name, then $s \in D^{\mathcal{I}}$ by definition.

2. If $D = \neg A$ for $A \in \mathsf{NC}$, then $A \notin \mathcal{L}(s)$ (due to (T2)), so $s \in \Delta^{\mathcal{I}} \setminus A^{\mathcal{I}} = D^{\mathcal{I}}$.

3. If $D = (C_1 \sqcap C_2)$, then, due to (T3), $C_1 \in \mathcal{L}(s)$ and $C_2 \in \mathcal{L}(s)$, and hence, by
induction, $s \in C_1^{\mathcal{I}}$ and $s \in C_2^{\mathcal{I}}$. Thus, $s \in (C_1 \sqcap C_2)^{\mathcal{I}}$.

4. The case $D = (C_1 \sqcup C_2)$ is analogous to the previous one.

5. If $D = \exists R.E$, then, due to (T5), there is some $t \in \mathbf{S}$ such that $(s,t) \in \mathcal{E}(R)$ and
$E \in \mathcal{L}(t)$. By definition of $\mathcal{I}$, $(s,t) \in R^{\mathcal{I}}$ holds as follows. It is immediate if $R \in \mathsf{NR}$.
If $R = S^{-1}$ for $S \in \mathsf{NR}$, then $(s,t) \in \mathcal{E}(R)$ implies $(t,s) \in \mathcal{E}(S)$ by (T8). Hence,
$(t,s) \in S^{\mathcal{I}}$ and $(s,t) \in R^{\mathcal{I}}$ holds. By induction, $t \in E^{\mathcal{I}}$ and hence $s \in (\exists R.E)^{\mathcal{I}}$.

6. If $D = (\forall R.E)$ and $(s,t) \in R^{\mathcal{I}}$, then either

(a) $(s,t) \in \mathcal{E}(R)$ and $E \in \mathcal{L}(t)$ (due to (T6)), or

(b) $(s,t) \notin \mathcal{E}(R)$. Due to (T8), this can only be the case if R is transitive and there
exists a path of length $n \geq 1$ such that $(s,s_1), (s_1,s_2), \ldots, (s_n,t) \in \mathcal{E}(R)$. Due
to (T7), $\forall R.E \in \mathcal{L}(s_i)$ for all $1 \leq i \leq n$, and we have $E \in \mathcal{L}(t)$, again due to (T6).

In both cases, by induction $t \in E^{\mathcal{I}}$ holds, and hence $s \in (\forall R.E)^{\mathcal{I}}$.

For the converse direction, if $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ is a model of C, then a tableau $T = (\mathbf{S}, \mathcal{L}, \mathcal{E})$
for C can be defined by:

    $\mathbf{S} = \Delta^{\mathcal{I}}$,
    $\mathcal{E}(R) = R^{\mathcal{I}}$,
    $\mathcal{L}(s) = \{ D \in \mathit{sub}(C) \mid s \in D^{\mathcal{I}} \}$.

It remains to demonstrate that T is a tableau for C:

1. T satisfies (T1) - (T6) and (T8) as a direct consequence of the semantics of $\mathcal{SI}$-concepts
and of inverse roles.

2. If $s \in (\forall R.D)^{\mathcal{I}}$, $(s,t) \in R^{\mathcal{I}}$ and $\mathsf{Trans}(R)$, then $t \in (\forall R.D)^{\mathcal{I}}$ unless there is some u
such that $(t,u) \in R^{\mathcal{I}}$ and $u \notin D^{\mathcal{I}}$. However, if $(s,t) \in R^{\mathcal{I}}$, $(t,u) \in R^{\mathcal{I}}$ and $\mathsf{Trans}(R)$,
then $(s,u) \in R^{\mathcal{I}}$, which would imply $s \notin (\forall R.D)^{\mathcal{I}}$. T therefore satisfies (T7). ■

6.1.4 Constructing an SI Tableau

From Lemma 6.3, it follows that an algorithm that constructs a tableau for an $\mathcal{SI}$-concept
C can be used as a decision procedure for the satisfiability of C. Such an algorithm will
now be described.

Like the tableau algorithms that we have studied so far, the algorithm for $\mathcal{SI}$ works
by manipulating a constraint system. In the presence of blocking, and especially in the
case of the refined blocking we are using for $\mathcal{SI}$, it is more convenient to emphasise the
graph structure of the constraint system and deal with an edge- and node-labelled graph
instead of an ABox. In the case of the $\mathcal{SI}$-algorithm, the constraint system has the form of a
completion tree.
Algorithm 6.4 (The $\mathcal{SI}$-algorithm)

Let C be an $\mathcal{SI}$-concept in NNF to be tested for satisfiability and $\overline{\mathsf{NR}}_C$ the set of roles that
occur in C together with their inverses. A completion tree $\mathbf{T} = (V, E, \mathbf{L}, \mathbf{B})$ is a labelled
tree in which each node $x \in V$ is labelled with two subsets $\mathbf{L}(x)$ and $\mathbf{B}(x)$ of $\mathit{sub}(C)$.
Furthermore, each edge $(x,y) \in E$ in the tree is labelled $\mathbf{L}(x,y) = R$ for some (possibly
inverse) role $R \in \overline{\mathsf{NR}}_C$. Nodes and edges are added when expanding $\exists R.D$ and $\exists R^{-1}.D$
constraints; they correspond to relationships between pairs of individuals and are always
directed from the root node to the leaf nodes. The algorithm expands the tree by extending
$\mathbf{L}(x)$ (and possibly $\mathbf{B}(x)$) for some node x, or by adding new leaf nodes.

A completion tree $\mathbf{T}$ is said to contain a clash if, for a node x in $\mathbf{T}$, it holds that there
is a concept name A such that $\{A, \neg A\} \subseteq \mathbf{L}(x)$.

If nodes x and y are connected by an edge $(x,y) \in E$, then y is called a successor of x
and x is called a predecessor of y. If $\mathbf{L}(x,y) = R$, then y is called an R-successor of x and
x is called an $\mathsf{Inv}(R)$-predecessor of y.

Ancestor is the transitive closure of predecessor and descendant is the transitive closure
of successor. A node y is called an R-neighbour of a node x if either y is an R-successor
of x or y is an R-predecessor of x.

To define the blocking condition we need the following auxiliary definition. For a
(possibly inverse) role $S \in \overline{\mathsf{NR}}_C$, we define the set $\mathbf{L}(y)/S$ by

    $\mathbf{L}(y)/S = \{ \forall S.D \in \mathbf{L}(y) \}$.

A node y is blocked if, for some ancestor x of y, x is blocked or

    $\mathbf{B}(y) \subseteq \mathbf{L}(x)$ and $\mathbf{L}(y)/\mathsf{Inv}(S) = \mathbf{L}(x)/\mathsf{Inv}(S)$,

where, for the unique predecessor y' of y in the completion tree, $\mathbf{L}(y',y) = S$ holds.
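The blocking condition can be sketched in Python as follows. This is an illustration under assumptions that are not part of the thesis: universal restrictions are encoded as tuples `("all", role, filler)`, the inverse of a role name `R` is written `R-`, and the names `Node`, `inv`, `universals` and `is_blocked` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


def inv(role):
    """Inverse of a role; 'R-' stands for the inverse of the role name 'R'."""
    return role[:-1] if role.endswith("-") else role + "-"


def universals(label, role):
    """The set L/role: all universal restrictions over `role` in the label."""
    return {c for c in label if isinstance(c, tuple) and c[0] == "all" and c[1] == role}


@dataclass
class Node:
    L: set                           # complete label
    B: set                           # part of the label relevant for blocking
    parent: Optional["Node"] = None  # unique predecessor, None for the root
    role: Optional[str] = None       # label S of the edge from the predecessor


def is_blocked(y):
    """y is blocked if some ancestor x is itself blocked, or if
    B(y) is a subset of L(x) and L(y)/Inv(S) equals L(x)/Inv(S)."""
    if y.parent is None:             # the root has no ancestors
        return False
    s_inv = inv(y.role)
    x = y.parent
    while x is not None:
        if is_blocked(x):
            return True
        if y.B <= x.L and universals(y.L, s_inv) == universals(x.L, s_inv):
            return True
        x = x.parent
    return False
```

The sketch recomputes the condition from scratch on every call, which matches the dynamic nature of blocking: a block is never stored, so it is automatically broken or re-established when labels change.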

The algorithm initializes a tree $\mathbf{T}$ to contain a single node $x_0$, called the root node,
with $\mathbf{L}(x_0) = \mathbf{B}(x_0) = \{C\}$. $\mathbf{T}$ is then expanded by repeatedly applying the rules from
Figure 6.4.

The $\rightarrow_\exists$-rule is called generating; all other rules are called non-generating.

The completion tree is complete if, for some node x, $\mathbf{L}(x)$ contains a clash or if none
of the expansion rules is applicable. If the expansion rules can be applied in such a way
that they yield a complete, clash-free completion tree, then the algorithm returns "C is
satisfiable"; otherwise, the algorithm returns "C is not satisfiable".

Like for all other tableau algorithms studied in this thesis, it turns out (see the proof
of Lemma 6.8) that the choice of which rule to apply where and when is don't-care
non-deterministic: no choice can prohibit the discovery of a complete and clash-free completion
tree for a satisfiable concept. On the other hand, as before, the choice of the $\rightarrow_\sqcup$-rule is
don't-know non-deterministic: only certain choices will lead to the discovery of a complete
and clash-free completion tree for a satisfiable concept. For an implementation this means
that an arbitrary strategy that selects which rule to apply where will yield a complete
implementation but exhaustive search is required to consider the different choices of the
$\rightarrow_\sqcup$-rule. A similar situation exists in the case of the $\mathcal{SHIQ}$-algorithm in Section 6.3, where it
follows from the proof of Lemma 6.36 that the choice of which rule to apply where is don't-care
non-deterministic.

Figure 6.4 Tableau expansion rules for $\mathcal{SI}$

$\rightarrow_\sqcap$:  if 1. $C_1 \sqcap C_2 \in \mathbf{L}(x)$ and
          2. $\{C_1, C_2\} \not\subseteq \mathbf{L}(x)$
       then $\mathbf{L}(x) \rightarrow_\sqcap \mathbf{L}(x) \cup \{C_1, C_2\}$

$\rightarrow_\sqcup$:  if 1. $C_1 \sqcup C_2 \in \mathbf{L}(x)$ and
          2. $\{C_1, C_2\} \cap \mathbf{L}(x) = \emptyset$
       then $\mathbf{L}(x) \rightarrow_\sqcup \mathbf{L}(x) \cup \{E\}$ for some $E \in \{C_1, C_2\}$

$\rightarrow_\forall$:  if 1. $\forall R.D \in \mathbf{L}(x)$ and
          2. there is an R-successor y of x with $D \notin \mathbf{B}(y)$
       then $\mathbf{L}(y) \rightarrow_\forall \mathbf{L}(y) \cup \{D\}$ and $\mathbf{B}(y) \rightarrow_\forall \mathbf{B}(y) \cup \{D\}$ or
          2'. there is an R-predecessor y of x with $D \notin \mathbf{L}(y)$
       then $\mathbf{L}(y) \rightarrow_\forall \mathbf{L}(y) \cup \{D\}$ and delete all descendants of y.

$\rightarrow_{\forall_+}$: if 1. $\forall R.D \in \mathbf{L}(x)$ and $\mathsf{Trans}(R)$ and
          2. there is an R-successor y of x with $\forall R.D \notin \mathbf{B}(y)$
       then $\mathbf{L}(y) \rightarrow_{\forall_+} \mathbf{L}(y) \cup \{\forall R.D\}$ and $\mathbf{B}(y) \rightarrow_{\forall_+} \mathbf{B}(y) \cup \{\forall R.D\}$ or
          2'. there is an R-predecessor y of x with $\forall R.D \notin \mathbf{L}(y)$
       then $\mathbf{L}(y) \rightarrow_{\forall_+} \mathbf{L}(y) \cup \{\forall R.D\}$ and delete all descendants of y.

$\rightarrow_\exists$:  if 1. $\exists R.D \in \mathbf{L}(x)$, x is not blocked and no non-generating rule
             is applicable to x and any of its ancestors, and
          2. x has no R-neighbour y with $D \in \mathbf{B}(y)$
       then create a new node y with $\mathbf{L}(x, y) = R$ and $\mathbf{L}(y) = \mathbf{B}(y) = \{D\}$

Note that in the definition of successor and predecessor, the tree structure is reflected.
If y is an R-successor of x then this implies that y is a successor of x in the completion tree
and it is not the case that x is an $\mathsf{Inv}(R)$-successor of y. Successor and predecessor always
refer to the relative position of nodes in the completion tree. This is necessary because,
in the construction of a tableau from a complete and clash-free completion tree, the edges
pointing to blocked successors will be redirected to the respective blocking nodes, which
makes the relative position of nodes in the completion tree significant.

We are aiming for a PSPACE-decision procedure, so, like the $\mathcal{ALCQI}b$-algorithm
(Algorithm 4.21), the $\rightarrow_\forall$- and $\rightarrow_{\forall_+}$-rules delete parts of the completion tree whenever
information is propagated upward in the completion tree to make tracing possible.

Correctness

As before, correctness of the algorithm will be demonstrated by proving that, for an
$\mathcal{SI}$-concept C, it always terminates and that it returns "satisfiable" if and only if C is
satisfiable. To prove this, we follow a slightly different approach than the one that is indicated
by Theorem 3.6. The reason for this is that it is unclear how to deal with blocked nodes
when trying to define a suitable notion of satisfiability for a completion tree. We will come
back to this topic.

Before we start proving the properties we need to establish correctness of the
$\mathcal{SI}$-algorithm, let us state an obvious property of the $\mathcal{SI}$-algorithm:

Lemma 6.5

Let $\mathbf{T}$ be a completion tree generated by the $\mathcal{SI}$-algorithm. Then, for every node x of $\mathbf{T}$,
$\mathbf{B}(x) \subseteq \mathbf{L}(x)$.

Proof. Obviously, $\mathbf{B}(x_0) \subseteq \mathbf{L}(x_0)$ holds for the only node $x_0$ of the initial tree.
Subsequently, whenever a concept D is added to $\mathbf{B}(x)$ by an application of one of the rules, then
it is always also added to $\mathbf{L}(x)$. ■

We first show termination of the algorithm:

Lemma 6.6 (Termination)

For each $\mathcal{SI}$-concept C, the tableau algorithm terminates.

Proof. Let $m = \sharp\mathit{sub}(C)$. Obviously, m is linear in the length of C. Termination is a
consequence of the following properties of the expansion rules:

1. The expansion rules never remove concepts from node labels.

2. Successors are only generated for concepts of the form $\exists R.D$ and, for any node, each
of these concepts triggers the generation of at most one successor. Since $\mathit{sub}(C)$
contains at most m concepts of the form $\exists R.D$, the out-degree of the tree is bounded
by m.

3. Nodes are labelled with nonempty subsets of $\mathit{sub}(C)$. If a path p is of length $> 2^{2m}$,
then there are 2 nodes x, y on p, with $\mathbf{L}(x) = \mathbf{L}(y)$ and $\mathbf{B}(x) = \mathbf{B}(y)$, and blocking
occurs. Since a path on which nodes are blocked cannot become longer, paths are of
length at most $2^{2m} + 1$.

An infinite run of the completion algorithm can thus only occur due to an infinite
number of deletions of nodes of the tree. That this can never happen can be shown in
exactly the same way as in the proof of Lemma 4.22. ■

Lemma 6.7 (Soundness)

If the $\mathcal{SI}$-algorithm generates a complete and clash-free completion tree for a concept C,
then C has a tableau.

Proof. Let $\mathbf{T} = (V, E, \mathbf{L}, \mathbf{B})$ be the complete and clash-free completion tree constructed
by the tableau algorithm for C. A tableau $T = (\mathbf{S}, \mathcal{L}, \mathcal{E})$ can be defined by

    $\mathbf{S} = \{ x \mid x \text{ is a node in } \mathbf{T}, \text{ and } x \text{ is not blocked} \}$,
    $\mathcal{L} = \mathbf{L}|_{\mathbf{S}}$,
    $\mathcal{E}(R) = \{ (x, y) \in \mathbf{S} \times \mathbf{S} \mid$ 1. y is an R-neighbour of x or
                                  2. $\exists z.\, \mathbf{L}(x, z) = R$ and y blocks z or
                                  3. $\exists z.\, \mathbf{L}(y, z) = \mathsf{Inv}(R)$ and x blocks z $\}$.

We now show that T is a tableau for C. By definition of T, we have $\mathbf{L}(x) = \mathcal{L}(x)$ for all $x \in \mathbf{S}$,
so it is sufficient to establish the required properties for the function $\mathbf{L}$.

• $C \in \mathbf{L}(x_0)$ for the root $x_0$ of $\mathbf{T}$ and, since $x_0$ has no predecessors, it cannot be blocked.
Hence $C \in \mathcal{L}(s)$ for some $s \in \mathbf{S}$ and (T1) holds.

• (T2) is satisfied because $\mathbf{T}$ is clash-free.

• (T3) and (T4) are satisfied because neither the $\rightarrow_\sqcap$-rule nor the $\rightarrow_\sqcup$-rule apply to
any $x \in \mathbf{S}$. Hence $\mathbf{L}(x)$ satisfies the required properties.

• (T5) is satisfied because, for all $x \in \mathbf{S}$, if $\exists R.D \in \mathbf{L}(x)$, then the $\rightarrow_\exists$-rule ensures
that there is either:

1. an R-predecessor y with $D \in \mathbf{B}(y) \subseteq \mathbf{L}(y)$ (see Lemma 6.5). Because y is a
predecessor of x (which is an unblocked node), it cannot be blocked, so $y \in \mathbf{S}$
and $(x, y) \in \mathcal{E}(R)$.

2. an R-successor y with $D \in \mathbf{B}(y) \subseteq \mathbf{L}(y)$ (again, see Lemma 6.5). If y is not
blocked, then $y \in \mathbf{S}$ and $(x, y) \in \mathcal{E}(R)$. Otherwise, y is blocked by some z with
$\mathbf{B}(y) \subseteq \mathbf{L}(z)$. Hence $D \in \mathbf{L}(z)$, $z \in \mathbf{S}$ and $(x, z) \in \mathcal{E}(R)$.

• To show that (T6) is satisfied for all $x \in \mathbf{S}$, if $\forall R.D \in \mathbf{L}(x)$ and $(x, y) \in \mathcal{E}(R)$, we
have to consider three possible cases:

1. y is an R-neighbour of x. The $\rightarrow_\forall$-rule guarantees $D \in \mathbf{L}(y)$.

2. $\mathbf{L}(x, z) = R$, y blocks z. Then by the $\rightarrow_\forall$-rule we have $D \in \mathbf{B}(z)$ and, by the
definition of blocking, $\mathbf{B}(z) \subseteq \mathbf{L}(y)$. Hence $D \in \mathbf{L}(y)$.

3. $\mathbf{L}(y, z) = \mathsf{Inv}(R)$, x blocks z. From the definition of blocking we have that
$\mathbf{L}(z)/R = \mathbf{L}(x)/R$. Hence $\forall R.D \in \mathbf{L}(z)$ and the $\rightarrow_\forall$-rule guarantees $D \in \mathbf{L}(y)$.

In all three cases, $D \in \mathbf{L}(y)$ holds.

• For (T7), let $x \in \mathbf{S}$ with $\forall R.D \in \mathbf{L}(x)$, $(x, y) \in \mathcal{E}(R)$, and $\mathsf{Trans}(R)$. There are three
possible cases:

1. y is an R-neighbour of x. The $\rightarrow_{\forall_+}$-rule guarantees $\forall R.D \in \mathbf{L}(y)$.

2. $\mathbf{L}(x, z) = R$, y blocks z. Then, by the $\rightarrow_{\forall_+}$-rule, we have $\forall R.D \in \mathbf{B}(z)$ and, by
the definition of blocking, $\mathbf{B}(z) \subseteq \mathbf{L}(y)$. Hence $\forall R.D \in \mathbf{L}(y)$.

3. $\mathbf{L}(y, z) = \mathsf{Inv}(R)$, x blocks z. From the definition of blocking, we have that
$\mathbf{L}(z)/R = \mathbf{L}(x)/R$. Hence $\forall R.D \in \mathbf{L}(z)$ and the $\rightarrow_{\forall_+}$-rule guarantees
$\forall R.D \in \mathbf{L}(y)$.

• (T8) is satisfied because, for each $(x, y) \in \mathcal{E}(R)$, either:

1. x is an R-neighbour of y, so y is an $\mathsf{Inv}(R)$-neighbour of x and $(y, x) \in \mathcal{E}(\mathsf{Inv}(R))$.

2. $\mathbf{L}(x, z) = R$ and y blocks z, so $\mathbf{L}(x, z) = \mathsf{Inv}(\mathsf{Inv}(R))$ and $(y, x) \in \mathcal{E}(\mathsf{Inv}(R))$.

3. $\mathbf{L}(y, z) = \mathsf{Inv}(R)$ and x blocks z, so $(y, x) \in \mathcal{E}(\mathsf{Inv}(R))$. ■

We have already mentioned that it is problematic to define a suitable notion of
satisfiability for $\mathcal{SI}$-completion trees (as it would be required by Theorem 3.6) due to blocking.
As one can see, the blocked nodes of the completion tree do not play a role when defining 
the tableau, so (hidden) inconsistencies in the labels of indirectly blocked nodes should not 
prevent a complete and clash-free tree from being satisfiable. On the other hand, due to 
dynamic blocking, blocked nodes may become unblocked during a run of the algorithm, 
in which case inconsistencies in these nodes may prevent the discovery of a complete and 
clash-free tree. Consequently, for the completeness proof, we require all nodes, also the 
blocked ones, to be free from inconsistencies. Since we have not found a way to uniformly 
combine these two different approaches into a single notion of satisfiability for completion
trees, we give a proof for the correctness of the $\mathcal{SI}$-algorithm that does not rely on
Theorem 3.6.

Lemma 6.8

Let C be an $\mathcal{SI}$-concept in NNF. If C has a tableau, then the expansion rules can be applied
in such a way that the tableau algorithm yields a complete and clash-free completion tree
for C.

Proof. Let $T = (\mathbf{S}, \mathcal{L}, \mathcal{E})$ be a tableau for C. Using T, we guide the application of the
non-deterministic $\rightarrow_\sqcup$-rule such that the algorithm yields a completion tree $\mathbf{T}$ that is both
complete and clash-free. The algorithm starts with the initial tree $\mathbf{T}$ consisting of a single
node $x_0$, the root, with $\mathbf{B}(x_0) = \mathbf{L}(x_0) = \{C\}$.

T is a tableau, hence there is some $s_0 \in \mathbf{S}$ with $C \in \mathcal{L}(s_0)$. When applying the
expansion rules to $\mathbf{T}$, the application of the non-deterministic $\rightarrow_\sqcup$-rule is guided by the
labelling in the tableau T. We will expand $\mathbf{T}$ in such a way that the following invariant
holds: there exists a function $\pi$ that maps the nodes of $\mathbf{T}$ to elements of $\mathbf{S}$ such that

    $\mathbf{L}(x) \subseteq \mathcal{L}(\pi(x))$ holds for all nodes x of $\mathbf{T}$, and
    if $\mathbf{L}(x, y) = R$ then $(\pi(x), \pi(y)) \in \mathcal{E}(R)$ for all nodes x, y in $\mathbf{T}$.        (*)

CLAIM 6.9 If (*) holds for a completion tree $\mathbf{T}$ and a rule is applicable to $\mathbf{T}$, then it can
be applied in a way that maintains (*).

We have to distinguish the different rules:

• If the $\rightarrow_\sqcap$-rule can be applied to x in $\mathbf{T}$ with $D = C_1 \sqcap C_2 \in \mathbf{L}(x) \subseteq \mathcal{L}(\pi(x))$, then
$C_1, C_2$ are added to $\mathbf{L}(x)$. Since T is a tableau, $\{C_1, C_2\} \subseteq \mathcal{L}(\pi(x))$, and hence the
$\rightarrow_\sqcap$-rule preserves $\mathbf{L}(x) \subseteq \mathcal{L}(\pi(x))$.

• If the $\rightarrow_\sqcup$-rule can be applied to x in $\mathbf{T}$ with $D = C_1 \sqcup C_2 \in \mathbf{L}(x) \subseteq \mathcal{L}(\pi(x))$, then
there is an $E \in \{C_1, C_2\}$ such that $E \in \mathcal{L}(\pi(x))$, and the $\rightarrow_\sqcup$-rule can add E to $\mathbf{L}(x)$.
Hence the $\rightarrow_\sqcup$-rule can be applied in a way that preserves $\mathbf{L}(x) \subseteq \mathcal{L}(\pi(x))$.

• If the $\rightarrow_\exists$-rule can be applied to x in $\mathbf{T}$ with $D = \exists R.E \in \mathbf{L}(x) \subseteq \mathcal{L}(\pi(x))$, then
$D \in \mathcal{L}(\pi(x))$ and there is some $t \in \mathbf{S}$ with $(\pi(x), t) \in \mathcal{E}(R)$ and $E \in \mathcal{L}(t)$. The
$\rightarrow_\exists$-rule creates a new successor y of x and we extend $\pi$ by setting $\pi' := \pi[y \mapsto t]$,
i.e., $\pi'$ is the extension of $\pi$ that maps y to t. It is easy to see that the extended
completion tree together with the function $\pi'$ satisfies (*).

• If the $\rightarrow_\forall$-rule can be applied to x in $\mathbf{T}$ with $D = \forall R.E \in \mathbf{L}(x) \subseteq \mathcal{L}(\pi(x))$ and y is
an R-neighbour of x, then $(\pi(x), \pi(y)) \in \mathcal{E}(R)$, and thus $E \in \mathcal{L}(\pi(y))$. The $\rightarrow_\forall$-rule
adds E to $\mathbf{L}(y)$ and thus preserves $\mathbf{L}(y) \subseteq \mathcal{L}(\pi(y))$. The deletion of nodes can never
violate (*).

• If the $\rightarrow_{\forall_+}$-rule can be applied to x in $\mathbf{T}$ with $D = \forall R.E \in \mathbf{L}(x) \subseteq \mathcal{L}(\pi(x))$,
$\mathsf{Trans}(R)$, and y an R-neighbour of x, then $(\pi(x), \pi(y)) \in \mathcal{E}(R)$, and thus
$\forall R.E \in \mathcal{L}(\pi(y))$. The $\rightarrow_{\forall_+}$-rule adds $\forall R.E$ to $\mathbf{L}(y)$ and thus preserves
$\mathbf{L}(y) \subseteq \mathcal{L}(\pi(y))$. The deletion of nodes can never violate (*).

From this claim, the lemma can be derived as follows. It is obvious that the initial tree
satisfies (*): since T is a tableau for C, there exists an element $s_0 \in \mathbf{S}$ with $C \in \mathcal{L}(s_0)$
and hence the function $\pi$ that maps $x_0$ to $s_0$ satisfies the required properties. The claim
states that, whenever a rule is applicable, it can be applied in a way that preserves (*).
Obviously, no completion tree that satisfies (*) contains a clash as this would contradict
(T2). Moreover, from Lemma 6.6, we have that the expansion process terminates and thus
must eventually yield a complete and clash-free completion tree. ■

Theorem 6.10

The $\mathcal{SI}$-algorithm is a non-deterministic decision procedure for satisfiability and
subsumption of $\mathcal{SI}$-concepts.






Proof. Theorem 6.10 is an immediate consequence of Lemmas 6.3, 6.6, 6.7, and 6.8.
Moreover, since $\mathcal{SI}$ is closed under negation, subsumption of concepts $C \sqsubseteq D$ can be
reduced to the (un-)satisfiability of $C \sqcap \neg D$. ■
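The reduction of subsumption to unsatisfiability amounts to a single call of a satisfiability test. The following Python sketch shows the shape of this reduction only. The satisfiability procedure is passed in as a parameter `is_satisfiable` and is an assumption of the sketch, as are the tuple encoding of concepts and the function name `subsumed_by`.

```python
def subsumed_by(c, d, is_satisfiable):
    """Decide whether C is subsumed by D by testing the conjunction of C
    and the negation of D for unsatisfiability.

    is_satisfiable is any decision procedure for concept satisfiability,
    supplied by the caller; it receives the concept ("and", c, ("not", d)).
    """
    return not is_satisfiable(("and", c, ("not", d)))
```

Since the algorithm above expects its input in NNF, a caller would normalise the constructed concept before handing it to the satisfiability test.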



6.1.5 Complexity 

In Lemma 6.6 we have seen that the depth of a completion tree generated by the
$\mathcal{SI}$-algorithm is bounded exponentially in the size of the input concept. To show that the
algorithm can indeed be implemented to run in polynomial space, we need to carry out a
closer analysis of the length of paths in a completion tree.

In Lemmas 6.11 and 6.12, we establish a polynomial bound on the length of paths in
the completion tree in a manner similar to that used for the modal logic S4 and $\mathcal{ALC}_{R^+}$ in
(Halpern & Moses, 1992; Sattler, 1996a). It then remains to show that such a tree can be
constructed using only polynomial space.

Lemma 6.11

Let C be an $\mathcal{SI}$-concept and $m = \sharp\mathit{sub}(C)$, $n > m^3$, and $R \in \overline{\mathsf{NR}}_C$ be a role with $\mathsf{Trans}(R)$.
Let $x_1, \ldots, x_n$ be successive nodes of a completion tree generated for C by the $\mathcal{SI}$-algorithm
with $\mathbf{L}(x_i, x_{i+1}) = R$ for $1 \leq i < n$. If the $\rightarrow_\forall$- or the $\rightarrow_{\forall_+}$-rules cannot be applied to these
nodes, then there is a blocked node $x_i$ among them.

Proof. For each node x of the completion tree, $\mathbf{B}(x)$ only contains two kinds of concepts:
the concept that triggered the generation of the node x, denoted by $C_x$, and concepts which
were propagated down the completion tree by the first alternative of the $\rightarrow_\forall$- or $\rightarrow_{\forall_+}$-rules.
Moreover, $\mathbf{B}(x) \subseteq \mathbf{L}(x)$ holds for any node in the completion tree.

Firstly, consider the elements of $\mathbf{B}(x_i)$ for $i > 1$. Let $C_{x_i}$ denote the concept that
caused the generation of the node $x_i$. Then $\mathbf{B}(x_i) - \{C_{x_i}\}$ contains only concepts which
have been inserted using the $\rightarrow_\forall$- or the $\rightarrow_{\forall_+}$-rule. Let $D \in \mathbf{B}(x_i) - \{C_{x_i}\}$. Then either
$\forall R.D \in \mathbf{L}(x_{i-1})$ and the $\rightarrow_{\forall_+}$-rule makes sure that $\forall R.D \in \mathbf{B}(x_i)$, or D is already of the
form $\forall R.D'$ and has been inserted into $\mathbf{B}(x_i)$ by an application of the $\rightarrow_{\forall_+}$-rule to $x_{i-1}$.
In both cases, it follows that the $\rightarrow_\forall$- or the $\rightarrow_{\forall_+}$-rule yields $D \in \mathbf{B}(x_{i+1})$. Hence we have

    $\mathbf{B}(x_i) - \{C_{x_i}\} \subseteq \mathbf{B}(x_{i+1})$ for all $1 < i < n$,

and, since we have m choices for $C_{x_i}$,

    $\sharp\{\mathbf{B}(x_i) \mid 1 < i \leq n\} \leq m^2$.

Secondly, consider $\mathbf{L}(x_i)/\mathsf{Inv}(R)$. Again, the $\rightarrow_\forall$- and the $\rightarrow_{\forall_+}$-rules yield

    $\mathbf{L}(x_i)/\mathsf{Inv}(R) \subseteq \mathbf{L}(x_{i-1})/\mathsf{Inv}(R)$ for all $1 < i \leq n$,

which implies

    $\sharp\{\mathbf{L}(x_i)/\mathsf{Inv}(R) \mid 1 \leq i \leq n\} \leq m$.

Summing up, within $m^3 + 1$ nodes, there must be at least two nodes $x_j, x_k$ which satisfy

    $\mathbf{B}(x_j) = \mathbf{B}(x_k)$ and $\mathbf{L}(x_j)/\mathsf{Inv}(R) = \mathbf{L}(x_k)/\mathsf{Inv}(R)$.

This implies that one of these nodes is blocked by the other. ■

We will now use this lemma to give a polynomial bound on the length of paths in a 
completion tree generated by the completion rules. 

Lemma 6.12

The paths of a completion tree generated by the $\mathcal{SI}$-algorithm for a concept C have a
length of at most $m^4$ where $m = \sharp\mathit{sub}(C)$.

Proof. Let $\mathbf{T}$ be a completion tree generated for C by the $\mathcal{SI}$-algorithm. For every node
x of $\mathbf{T}$ we define $\ell(x) = \max\{|D| \mid D \in \mathbf{L}(x)\}$. If x is a predecessor of y in $\mathbf{T}$, then
this implies $\ell(x) \geq \ell(y)$. If not $\mathsf{Trans}(R)$ and $\mathbf{L}(x,y) = R$, then this implies $\ell(x) > \ell(y)$.
Furthermore, for $R_1 \neq R_2$ (but possibly $R_1 = \mathsf{Inv}(R_2)$), $\mathbf{L}(x,y) = R_1$ and $\mathbf{L}(y,z) = R_2$
implies $\ell(x) > \ell(z)$.

The only way that the maximal length of concepts does not decrease is along a pure
R-path with $\mathsf{Trans}(R)$. However, the $\rightarrow_\forall$- and the $\rightarrow_{\forall_+}$-rule must be applied before the
$\rightarrow_\exists$-rule may generate a new successor. Together with Lemma 6.11, this guarantees that
these pure R-paths have a length of at most $m^3$.

Summing up, we can have a path of length at most $m^3$ before decreasing the maximal
length of the concept in the node labels (or blocking occurs), which can happen at most
m times and thus yields an upper bound of $m^4$ on the length of paths in a completion
tree. ■

Note that the extra condition for the $\to_\exists$-rule, which delays its application until no
other rules are applicable, is necessary to prevent the generation of paths of exponential
length. Consider the following example for some $R$ with $\mathsf{Trans}(R)$:

C = 3R.D n VR.(3R.D) n ViT 1 .^ 

\[ D = (\forall R^{-1}.A_1 \sqcup \forall R^{-1}.B_1) \sqcap \dots \sqcap (\forall R^{-1}.A_n \sqcup \forall R^{-1}.B_n) \]

When started with a root node $x_0$ labelled $\mathbf{B}(x_0) = \mathbf{L}(x_0) = \{C\}$, the tableau algorithm
generates a successor node $x_1$ with

\[ \mathbf{B}(x_1) = \{D, \exists R.D, \forall R.(\exists R.D)\} \]

which, in turn, is capable of generating a further successor $x_2$ with $\mathbf{B}(x_2) = \mathbf{B}(x_1)$. Without
blocking, this would lead to an infinite path in the completion tree. Obviously, for $x_1$ and
$x_2$, the first part of the blocking condition is satisfied since $\mathbf{B}(x_2) \subseteq \mathbf{B}(x_1)$. However, the
second condition causes a problem since, in this example, we can generate $2^n$ different sets
of universal restrictions along $R^{-1}$ for each node. If we can apply the $\to_\exists$-rule freely, then
the algorithm might generate all of these $2^n$ nodes to find out that (after finally applying
the $\to_{\forall_+}$-rule that causes propagation of the concepts of the form $\forall R^{-1}.E$ upward in the
tree) within the first $n + 1$ nodes on this path there is a blocked one.

Lemma 6.13

The $\mathcal{SI}$-algorithm can be implemented in PSpace.

Proof. Let $C$ be the $\mathcal{SI}$-concept to be tested for satisfiability. We can assume $C$ to be in
NNF because every $\mathcal{SI}$-concept can be turned into NNF in linear time.

Let $m = \sharp\mathit{sub}(C)$. For each node $x$ of the completion tree, the labels $\mathbf{L}(x)$ and $\mathbf{B}(x)$
can be stored using $m$ bits for each set. Starting from the initial tree consisting of only a
single node $x_0$ with $\mathbf{L}(x_0) = \mathbf{B}(x_0) = \{C\}$, the expansion rules, as given in Figure 6.4, are
applied. If a clash is generated, then the algorithm fails and returns "C is unsatisfiable".
Otherwise, the completion tree is generated in a depth-first way: the algorithm keeps
track of exactly one path of the completion tree by memorizing, for each node $x$, for which of
the $\exists R.D$-concepts in $\mathbf{L}(x)$ successors have yet to be generated. This can be done using
additional $m$ bits for each node. The "deletion" of all successors in the $\to_\forall$- or the $\to_{\forall_+}$-rule
of a node $x$ is then simply realized by setting all these additional bits to "has yet to be
generated". There are three possible results of an investigation of a child of $x$:

• A clash is detected. This stops the algorithm with "C is unsatisfiable".

• The $\to_\forall$- or the $\to_{\forall_+}$-rule leads to an increase of $\mathbf{L}(x)$. This causes reconsideration
of all children of $x$, re-using the space used for former children of $x$.

• Neither of these first two cases happens. We can then forget about this subtree and
start the investigation of another child of $x$. If all children have been investigated,
we consider $x$'s predecessor.

Proceeding like this, the algorithm can be implemented using $2m + m$ bits for each
node, where the $2m$ bits are used to store the two labels of the node, while $m$ bits are
used to keep track of the successors already generated. Since we reuse the memory for
the successors, we only have to store one path of the completion tree at a time. From
Lemma 6.12, the length of this path is bounded by $m^4$. Summing up, we can test for the
existence of a completion tree using $O(m^5)$ bits.

Unfortunately, due to the $\to_\sqcup$-rule, the $\mathcal{SI}$-algorithm is a non-deterministic algorithm.
However, Savitch's theorem (Savitch, 1970) tells us that there is a deterministic
implementation of this algorithm using at most $O(m^{10})$ bits, which is still a polynomial bound. ■
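
The space accounting in the proof above can be replayed numerically. The following is only an illustrative sketch: the function name `space_bound_bits` is hypothetical, and squaring the concrete bit count is merely a stand-in for the order-of-magnitude statement of Savitch's theorem, not an exact figure.

```python
def space_bound_bits(m: int) -> dict:
    """Replay the bit counts from the proof of Lemma 6.13 for m = #sub(C)."""
    per_node = 2 * m + m              # two labels of m bits each, plus m bookkeeping bits
    max_path_length = m ** 4          # bound on the path length from Lemma 6.12
    nondeterministic = per_node * max_path_length   # 3 * m^5, i.e. O(m^5)
    deterministic = nondeterministic ** 2           # Savitch squares the space: O(m^10)
    return {
        "per_node": per_node,
        "max_path_length": max_path_length,
        "nondeterministic": nondeterministic,
        "deterministic": deterministic,
    }


bounds = space_bound_bits(2)
```

Both totals stay polynomial in $m$, which is all the lemma needs.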

Since $\mathcal{ALC}$ is a fragment of $\mathcal{SI}$, satisfiability of $\mathcal{SI}$-concepts is PSpace-hard, which
yields:

Theorem 6.14

Satisfiability and subsumption of $\mathcal{SI}$-concepts is PSpace-complete.

There is an immediate optimization of the algorithm which has been omitted for the
sake of the clarity of the presentation. We have only disallowed the application of the $\to_\exists$-rule
to a blocked node, which is sufficient to guarantee the termination of the algorithm. It
is also possible to disallow the application of more rules to a blocked node without violating
the soundness or the completeness of the algorithm, if the notion of blocking is slightly
adapted. It then becomes necessary to distinguish directly and indirectly blocked nodes.
More details can be found in (Horrocks & Sattler, 1998). The technique presented there
will stop the expansion of a blocked node earlier during the runtime of the algorithm and
hence will save some work.



6.2 Adding Role Hierarchies and Qualifying Number Restrictions: $\mathcal{SHIQ}$

In this section, we study aspects of reasoning with the DL $\mathcal{SHIQ}$, i.e., $\mathcal{SI}$ extended with
qualifying number restrictions and role hierarchies. Qualifying number restrictions have
already been introduced in Chapter 4 and require no further discussion. Role hierarchies
(1997), which have already been present in early DL systems like BACK (Quantz &
Kindermann, 1990), allow one to express inclusion relationships between roles. For example, role
hierarchies can be used to state that the role has_child is a sub-role of has_offspring.
This makes it possible to infer that the child of someone whose offspring are all rich must
also be rich.

The combination of role hierarchies with transitive roles is particularly interesting
because it allows one to capture various aspects of part-whole relations (Sattler, 2000). It is
also interesting because it is sufficiently expressive to internalise general TBoxes (Baader,
1991; Schild, 1991; Baader et al., 1993), i.e., it allows for a reduction of concept
satisfiability w.r.t. general TBoxes to pure concept satisfiability, which is always an indication of high
expressive power of a Description Logic.

After defining syntax and semantics of $\mathcal{SHIQ}$, we show how internalisation of general
axioms can be accomplished. We then determine the worst-case complexity of satisfiability
for $\mathcal{SHIQ}$-concepts as ExpTime-complete even if numbers in the input are in binary coding.
This is achieved by a reduction from $\mathcal{SHIQ}$ to $\mathcal{ALCQI}b$, where role conjunction and general
TBoxes are used to simulate role hierarchies and transitive roles.

While this reduction helps to determine the exact worst-case complexity of the problem,
one cannot expect to obtain an efficient algorithm from it. The reason for this is that it
relies on the highly inefficient automata construction used to prove Theorem 4.38. To
overcome this problem, we present a tableau algorithm that decides satisfiability of $\mathcal{SHIQ}$-concepts.
In the worst case, this algorithm runs in 2-NExpTime. Yet, it is amenable
to optimizations and is the basis of the highly optimized DL system iFaCT (Horrocks,
1999), an offspring of the FaCT system (Horrocks, 1998), which exhibits good performance
in system comparisons (Massacci & Donini, 2000; Horrocks, 2000).

6.2.1 Syntax and Semantics

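Definition 6.15 (Syntax and Semantics of $\mathcal{SHIQ}$)

Let $\mathsf{NR}$ be a set of atomic role names, and $\mathsf{NR}_+ \subseteq \mathsf{NR}$ a set of transitive role names. The
set of $\mathcal{SHIQ}$-roles is defined, like the set of $\mathcal{SI}$-roles, by $\overline{\mathsf{NR}} := \mathsf{NR} \cup \{R^{-1} \mid R \in \mathsf{NR}\}$. A role
$R \in \overline{\mathsf{NR}}$ is called transitive iff $R \in \mathsf{NR}_+$ or $\mathsf{Inv}(R) \in \mathsf{NR}_+$.

A role inclusion axiom is of the form $R \sqsubseteq S$, for two $\mathcal{SHIQ}$-roles $R$ and $S$. A set of role
inclusion axioms $\mathcal{R}$ is called a role hierarchy. We define $\sqsubseteq^*$ as the transitive-reflexive closure of
the relation $\mathcal{R} \cup \{\mathsf{Inv}(R) \sqsubseteq \mathsf{Inv}(S) \mid R \sqsubseteq S \in \mathcal{R}\}$. If $R \sqsubseteq^* S$ then $R$ is called a sub-role of $S$
and $S$ is called a super-role of $R$ (w.r.t. $\mathcal{R}$).

A role $R$ is called simple with respect to $\mathcal{R}$ iff $R$ does not have a transitive sub-role.

Let $\mathsf{NC}$ be a set of concept names. The set of $\mathcal{SHIQ}$-concepts is built inductively from
these using the following grammar, where $A \in \mathsf{NC}$, $n \in \mathbb{N}$, $R \in \overline{\mathsf{NR}}$ is an arbitrary role,
and $S \in \overline{\mathsf{NR}}$ is a simple role:

\[ C ::= A \mid \neg C \mid C_1 \sqcap C_2 \mid C_1 \sqcup C_2 \mid \forall R.C \mid \exists R.C \mid (\ge n\, S\, C) \mid (\le n\, S\, C). \]

An interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ consists of a non-empty set $\Delta^{\mathcal{I}}$ and a valuation $\cdot^{\mathcal{I}}$ that
maps every concept name $A$ to a subset $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ and every role name $R$ to a binary relation
$R^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ with the additional property that every transitive role name $R \in \mathsf{NR}_+$ is
interpreted by a transitive relation.

Such an interpretation is inductively extended to arbitrary $\mathcal{SHIQ}$-concepts in the usual
way (see Definition 4.17).

An interpretation $\mathcal{I}$ satisfies a role hierarchy $\mathcal{R}$ iff $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$ for each $R \sqsubseteq S \in \mathcal{R}$; we
denote this fact by $\mathcal{I} \models \mathcal{R}$ and say that $\mathcal{I}$ is a model of $\mathcal{R}$.

A concept $C$ is satisfiable with respect to a role hierarchy $\mathcal{R}$ iff there is some
interpretation $\mathcal{I}$ such that $\mathcal{I} \models \mathcal{R}$ and $C^{\mathcal{I}} \neq \emptyset$. Such an interpretation is called a model of $C$ w.r.t.
$\mathcal{R}$. A concept $D$ subsumes a concept $C$ w.r.t. $\mathcal{R}$ (written $C \sqsubseteq_{\mathcal{R}} D$) iff $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ holds for
each model $\mathcal{I}$ of $\mathcal{R}$. For an interpretation $\mathcal{I}$, an individual $x \in \Delta^{\mathcal{I}}$ is called an instance of
a concept $C$ iff $x \in C^{\mathcal{I}}$.

Satisfiability of concepts w.r.t. TBoxes and role hierarchies is defined in the usual way.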

o 
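
The notions of sub-role, transitive role and simple role from Definition 6.15 can be made concrete with a small sketch. Everything here is an assumption made for illustration: roles are plain strings, an inverse role is written with a trailing "-", and the roles has_descendant and related_to are hypothetical additions to the has_child example. The closure is computed by a naive fixpoint, which is adequate only for small hierarchies.

```python
def inv(role: str) -> str:
    """Inverse of a role; an inverse role is written with a trailing '-'."""
    return role[:-1] if role.endswith("-") else role + "-"


def sub_role_closure(axioms):
    """Reflexive-transitive closure of the axioms together with Inv(R) <= Inv(S)."""
    edges = set(axioms) | {(inv(r), inv(s)) for r, s in axioms}
    roles = {r for edge in edges for r in edge}
    closure = edges | {(r, r) for r in roles}
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_transitive(role, transitive_names):
    """A role is transitive iff it or its inverse is a transitive role name."""
    return role in transitive_names or inv(role) in transitive_names


def is_simple(role, closure, transitive_names):
    """A role is simple iff none of its sub-roles (itself included) is transitive."""
    subs = {s for s, r in closure if r == role} | {role}
    return not any(is_transitive(s, transitive_names) for s in subs)


axioms = {("has_child", "has_offspring"), ("has_descendant", "related_to")}
transitive_names = {"has_descendant"}
closure = sub_role_closure(axioms)
```

In this toy hierarchy, has_offspring is simple and may therefore appear in number restrictions, while related_to is not, because it has the transitive sub-role has_descendant.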

As shown in (Horrocks et al., 1999), the restriction of qualifying number restrictions to
simple roles is necessary to maintain decidability of $\mathcal{SHIQ}$. Without this restriction, it is
possible to reduce an undecidable tiling problem (Berger, 1966) to concept satisfiability.

Internalization of TBoxes

Evidence of the high expressivity of $\mathcal{SHIQ}$ is the fact that it allows for the internalization of
TBoxes using a "universal" role $U$, that is, a transitive super-role of all relevant roles.

Lemma 6.16

Let $C$ be a $\mathcal{SHIQ}$-concept, $\mathcal{R}$ a role hierarchy, and $\mathcal{T}$ a $\mathcal{SHIQ}$-TBox. Define

\[ C_{\mathcal{T}} := \bigsqcap_{C_i \sqsubseteq D_i \in \mathcal{T}} \neg C_i \sqcup D_i, \]

and let $U \in \mathsf{NR}_+$ be a transitive role that does not occur in $\mathcal{T}$, $C$, or $\mathcal{R}$. We set

\[ \mathcal{R}_U := \mathcal{R} \cup \{R \sqsubseteq U, \mathsf{Inv}(R) \sqsubseteq U \mid R \text{ occurs in } \mathcal{T}, C, \text{ or } \mathcal{R}\}. \]

Then $C$ is satisfiable w.r.t. $\mathcal{T}$ and $\mathcal{R}$ iff $C \sqcap C_{\mathcal{T}} \sqcap \forall U.C_{\mathcal{T}}$ is satisfiable w.r.t. $\mathcal{R}_U$.

Note that augmenting $\mathcal{R}$ to obtain $\mathcal{R}_U$ in this manner does not turn simple roles into
non-simple roles. The proof of Lemma 6.16 is similar to the ones that can be found in
(Schild, 1991; Baader, 1991). Most importantly, it must be shown that (a) if a $\mathcal{SHIQ}$-concept
$C$ is satisfiable with respect to a TBox $\mathcal{T}$ and a role hierarchy $\mathcal{R}$, then $C$, $\mathcal{T}$ have a
connected model, i.e., a model where all elements are connected by roles occurring in $C$ or
$\mathcal{T}$, and (b) if $y$ is reachable from $x$ via a role path (possibly involving inverse roles), then
$(x, y) \in U^{\mathcal{I}}$. These are easy consequences of the semantics of $\mathcal{SHIQ}$ and the definition of
$\mathcal{R}_U$. As a corollary, we get:

Theorem 6.17

Satisfiability and subsumption of $\mathcal{SHIQ}$-concepts w.r.t. general TBoxes and role hierarchies
are polynomially reducible to (un)satisfiability of $\mathcal{SHIQ}$-concepts w.r.t. role hierarchies.
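
The construction of Lemma 6.16 is purely syntactic, so it can be sketched in a few lines. The encoding below is an assumption made for illustration only: concepts are nested tuples, a TBox is a list of pairs (C_i, D_i) standing for the axioms C_i ⊑ D_i, and `internalise` is a hypothetical helper name. The caller is responsible for declaring the universal role as transitive.

```python
def inv(role: str) -> str:
    """Inverse of a role; an inverse role is written with a trailing '-'."""
    return role[:-1] if role.endswith("-") else role + "-"


def internalise(concept, tbox, hierarchy, roles, universal="U"):
    """Return the internalised concept and the extended role hierarchy R_U."""
    assert universal not in roles, "the universal role must be fresh"
    # C_T: conjunction of (not C_i or D_i) over all TBox axioms C_i <= D_i
    c_t = ("and",) + tuple(("or", ("not", c), d) for c, d in tbox)
    # R_U: every relevant role and its inverse becomes a sub-role of U
    r_u = set(hierarchy)
    for r in roles:
        r_u.add((r, universal))
        r_u.add((inv(r), universal))
    # C and C_T and (forall U. C_T)
    return ("and", concept, c_t, ("all", universal, c_t)), r_u


tbox = [("Parent", ("some", "has_child", "Person"))]
hierarchy = {("has_child", "has_offspring")}
internalised, r_u = internalise("Parent", tbox, hierarchy, {"has_child", "has_offspring"})
```

The output grows only linearly in the size of the TBox and the number of roles, which matches the polynomial reduction stated in Theorem 6.17.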

Cycle-free Role Hierarchies

In what we have said so far, a role hierarchy $\mathcal{R}$ may contain a cycle, i.e., there may be roles
$R, S \in \overline{\mathsf{NR}}$ with $R \neq S$, $S \sqsubseteq^* R$, and $R \sqsubseteq^* S$. Such cycles would add extra difficulties to the
following considerations. The next lemma shows that, w.l.o.g., we only need to consider
role hierarchies that are cycle-free.

Lemma 6.18

Let $C$ be a $\mathcal{SHIQ}$-concept and $\mathcal{R}$ a role hierarchy. There exists a $\mathcal{SHIQ}$-concept $C'$ and a
role hierarchy $\mathcal{R}'$ polynomially computable from $C$, $\mathcal{R}$ such that $\mathcal{R}'$ is cycle-free and $C$ is
satisfiable w.r.t. $\mathcal{R}$ iff $C'$ is satisfiable w.r.t. $\mathcal{R}'$.

Proof. The set $\mathcal{R}$ can be viewed as a directed graph $G = (V, E)$ with vertices $V = \{R \mid
R \text{ occurs in } \mathcal{R}\}$ and $E = \{(S, R) \mid S \sqsubseteq R \in \mathcal{R}\}$. The strongly connected components of $G$
can be calculated in quadratic time. For every non-trivial strongly connected component
$\{R_1, \dots, R_k\}$, select an arbitrary $S \in \{R_1, \dots, R_k\}$ such that $S \in \mathsf{NR}_+$ if $\{R_1, \dots, R_k\} \cap
\mathsf{NR}_+ \neq \emptyset$. For every $1 \le i \le k$, replace $R_i$ in $C$ and $\mathcal{R}$ by $S$. The results of this replacement
are called $C'$ and $\mathcal{R}'$, respectively. It is obvious that these can be obtained from $C$, $\mathcal{R}$ in
polynomial time and that $(\mathcal{R}')^+$ is cycle-free.
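
The replacement step described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions: roles are plain strings, the hypothetical function `collapse_cycles` finds strongly connected components by a naive reachability fixpoint rather than a dedicated SCC algorithm, and the returned mapping `rep` is what one would apply to the role names in C to obtain C'.

```python
def collapse_cycles(hierarchy, transitive):
    """Merge each cycle of the role hierarchy into one representative role."""
    roles = {r for axiom in hierarchy for r in axiom}
    # reach[r] collects all roles reachable from r along inclusion axioms
    reach = {r: {r} for r in roles}
    changed = True
    while changed:
        changed = False
        for sub, sup in hierarchy:
            for r in roles:
                if sub in reach[r] and sup not in reach[r]:
                    reach[r].add(sup)
                    changed = True
    rep = {}
    for r in roles:
        # the strongly connected component of r: roles that reach r and are reached by r
        scc = sorted(s for s in roles if s in reach[r] and r in reach[s])
        trans = [s for s in scc if s in transitive]
        # prefer a transitive representative if the component contains one
        rep[r] = trans[0] if trans else scc[0]
    # rewrite the axioms and drop those that became trivial
    new_hierarchy = {(rep[a], rep[b]) for a, b in hierarchy if rep[a] != rep[b]}
    return rep, new_hierarchy


rep, cycle_free = collapse_cycles({("P", "Q"), ("Q", "P"), ("Q", "R")}, transitive={"Q"})
```

Here P and Q form a cycle, Q is chosen as representative because it is transitive, and the only remaining axiom is Q ⊑ R.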

For every $\mathcal{I}$ with $\mathcal{I} \models \mathcal{R}$, it is easy to see that, for every strongly connected component
$\{R_1, \dots, R_k\}$ of $G$, $R_i^{\mathcal{I}} = R_j^{\mathcal{I}}$ holds for every $1 \le i, j \le k$ and, if $\{R_1, \dots, R_k\} \cap \mathsf{NR}_+ \neq \emptyset$,
then $R_i^{\mathcal{I}}$ is transitive for every $1 \le i \le k$. Hence, $C$ is satisfiable w.r.t. $\mathcal{R}$ iff $C'$ is satisfiable
w.r.t. $\mathcal{R}'$. ■

Thus, from now on, we only consider cycle-free role hierarchies.

6.2.2 The Complexity of Reasoning with $\mathcal{SHIQ}$

So far, the exact complexity of reasoning with $\mathcal{SHIQ}$ has been an open problem. It was clear
that it is ExpTime-hard as a corollary of Theorem 3.18 and this also holds for pure concept
satisfiability by Theorem 6.17. Following De Giacomo's ExpTime-completeness result for
the DL $\mathcal{CIQ}$ (1995), it has been conjectured that the problem can be solved in ExpTime.
Yet, the results from (De Giacomo, 1995) are valid for unary coding of numbers only and
do not easily transfer to $\mathcal{SHIQ}$ because of the presence of role hierarchies. Here, we verify
the conjecture by giving a polynomial reduction of $\mathcal{SHIQ}$-satisfiability to satisfiability of
$\mathcal{ALCQI}b$-concepts w.r.t. general TBoxes. In Theorem 4.38, we have already shown that the
latter problem can be solved in ExpTime, also for the case of binary coding of numbers
in the input. Our reduction combines two techniques:

1. To deal with a role hierarchy $\mathcal{R}$, we replace every role $R$ by the role conjunction

\[ R^\uparrow = \bigsqcap_{R \sqsubseteq^* S} S. \]

Note that, since $R \sqsubseteq^* R$, $R$ occurs in $R^\uparrow$. This usage of role conjunction to express
role hierarchies is common knowledge in the DL community but, to the best of our
knowledge, there exists no publication that explicitly mentions it.

2. For transitive roles, we shift the technique employed in the $\mathcal{SI}$-algorithm to deal with
transitive roles into a set of TBox axioms. For $\mathcal{SI}$, transitive roles were dealt with
by explicitly propagating assertions of the form $\forall R.D$ to all $R$-successors of a node
$x$ using the $\to_{\forall_+}$-rule. This makes it possible to turn an $\mathcal{SI}$-tableau into a model
(which must interpret transitive roles with transitive relations) by transitively closing
the role relations explicitly asserted in the tableau. Here, we achieve the same effect
using a set of TBox axioms. A similar idea can be found in (de Nivelle, 2000).
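
The first technique can be illustrated on a toy interpretation. In this sketch (all names are assumptions made for illustration) the role conjunction of a role is represented by the set of its super-roles, and it is interpreted as the intersection of the relations of these super-roles. The parameter `closure` is assumed to contain the reflexive-transitive sub-role relation as pairs (sub, sup), including the reflexive pairs.

```python
def up(role, closure):
    """Super-roles of `role` w.r.t. the closure: the conjuncts of its role conjunction."""
    return {sup for sub, sup in closure if sub == role}


def interpret_up(role, closure, interp):
    """A role conjunction is interpreted as the intersection of its conjuncts."""
    relations = [interp[s] for s in up(role, closure)]
    return set.intersection(*relations)


# c is a sub-role of o; reflexive pairs are part of the closure
closure = {("c", "c"), ("o", "o"), ("c", "o")}

# an interpretation that satisfies the hierarchy (c is contained in o) ...
model = {"c": {(1, 2)}, "o": {(1, 2), (1, 3)}}
# ... and one that violates it
non_model = {"c": {(1, 2), (4, 5)}, "o": {(1, 2)}}
```

In `model`, the conjunction for c coincides with the relation of c itself. In `non_model` the two differ, but the conjunction for c is still contained in the conjunction for o, so the hierarchy is respected by the conjunctions even where the original interpretation violates it.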

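The usage of $R^\uparrow$ to capture the role hierarchies is motivated by the following
observations.

Lemma 6.19

Let $\mathcal{R}$ be a role hierarchy. If $S \sqsubseteq^* R$ then $(S^\uparrow)^{\mathcal{I}} \subseteq (R^\uparrow)^{\mathcal{I}}$ for every interpretation $\mathcal{I}$. Also,
for every interpretation $\mathcal{I}$ with $\mathcal{I} \models \mathcal{R}$, $(R^\uparrow)^{\mathcal{I}} = R^{\mathcal{I}}$.

Proof. If $S \sqsubseteq^* R$ then $\{S' \mid S \sqsubseteq^* S'\} \supseteq \{S' \mid R \sqsubseteq^* S'\}$ and hence $(S^\uparrow)^{\mathcal{I}} \subseteq (R^\uparrow)^{\mathcal{I}}$. If $\mathcal{I} \models \mathcal{R}$
then $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$ for every $S$ with $R \sqsubseteq^* S$. Hence $(R^\uparrow)^{\mathcal{I}} = R^{\mathcal{I}}$. ■

The reduction that captures the transitive roles involves concepts from the following
set: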

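Definition 6.20

For a $\mathcal{SHIQ}$-concept $C$ and a role hierarchy $\mathcal{R}$, the set $\mathit{clos}(C, \mathcal{R})$ is the smallest set $X$ of
$\mathcal{SHIQ}$-concepts that satisfies the following conditions:

• $C \in X$,

• $X$ is closed under sub-concepts and $\sim$,³ and

• if $\forall R.D \in X$, $T \sqsubseteq^* R$, and $\mathsf{Trans}(T)$, then $\forall T.D \in X$.

◇

³ Like in Chapter 4, with $\sim D$ we denote $\mathrm{NNF}(\neg D)$.

It is easy to see that, for a concept $C$ and a role hierarchy $\mathcal{R}$, the set $\mathit{clos}(C, \mathcal{R})$ is
"small":

Lemma 6.21

For a $\mathcal{SHIQ}$-concept $C$ in NNF and a role hierarchy $\mathcal{R}$, $\sharp\mathit{clos}(C, \mathcal{R}) = O(|C| \times |\mathcal{R}|)$.

Proof. Like in the proof of Lemma 4.9, it is easy to see that the smallest set $X'$ that
contains $C$ and is closed under sub-concepts and $\sim$ contains $O(|C|)$ concepts. For $\mathcal{SHIQ}$,
we additionally have to add concepts $\forall T.D$ to $X'$ if $\forall R.D \in X'$ with $T \sqsubseteq^* R$ and $\mathsf{Trans}(T)$,
and again close $X'$ under sub-concepts and $\sim$ to obtain $\mathit{clos}(C, \mathcal{R})$. This may yield at most
two concepts for every concept in $X'$ and every role $T$ that occurs in $\mathcal{R}$ because, for every
such concept $\forall R.D \in X'$, it suffices to add $\forall T.D$ and $\exists T.{\sim}D$. Since it is a sub-concept of
$\forall R.D$, $D$ does not need to be reconsidered in the closure process. ■

We now formally introduce the employed reduction from $\mathcal{SHIQ}$ to $\mathcal{ALCQI}b$.

Definition 6.22

Let $C$ be a $\mathcal{SHIQ}$-concept in NNF and $\mathcal{R}$ a role hierarchy. For every concept $\forall R.D \in
\mathit{clos}(C, \mathcal{R})$ let $X_{R,D} \in \mathsf{NC}$ be a unique concept name that does not occur in $C$. We define the
function $\cdot^{tr}$ inductively on the structure of concepts (in NNF) by setting

\[
\begin{array}{rcll}
A^{tr} & = & A & \text{for all } A \in \mathsf{NC} \\
(\neg A)^{tr} & = & \neg A & \text{for all } A \in \mathsf{NC} \\
(C_1 \sqcap C_2)^{tr} & = & C_1^{tr} \sqcap C_2^{tr} & \\
(C_1 \sqcup C_2)^{tr} & = & C_1^{tr} \sqcup C_2^{tr} & \\
(\bowtie n\, R\, D)^{tr} & = & (\bowtie n\, R^\uparrow\, D^{tr}) & \\
(\forall R.D)^{tr} & = & X_{R,D} & \\
(\exists R.D)^{tr} & = & \neg X_{R,{\sim}D} &
\end{array}
\]

The TBox $\mathcal{T}_C$ is defined by

\[
\mathcal{T}_C = \{ X_{R,D} \doteq \forall R^\uparrow.D^{tr} \mid \forall R.D \in \mathit{clos}(C, \mathcal{R}) \} \cup
\Big\{ X_{R,D} \sqsubseteq \bigsqcap_{T \sqsubseteq^* R,\ \mathsf{Trans}(T)} \forall T^\uparrow.X_{T,D} \;\Big|\; \forall R.D \in \mathit{clos}(C, \mathcal{R}) \Big\}
\]

◇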
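
The translation function of Definition 6.22 is a straightforward structural recursion. The following sketch covers only the concept translation, not the construction of the TBox, and rests on an assumed encoding: concepts in NNF are nested tuples such as ("atom", "A"), ("all", "r", D) or ("atleast", n, "s", D), the role conjunction of a role r is written ("up", "r"), and the fresh concept name for a universal restriction is the tuple ("X", r, D). The helper `nnf_neg` plays the role of the operator that maps D to the NNF of its negation.

```python
def nnf_neg(c):
    """Negation of an NNF concept, again in NNF."""
    op = c[0]
    if op == "atom":
        return ("not", c[1])
    if op == "not":
        return ("atom", c[1])
    if op == "and":
        return ("or", nnf_neg(c[1]), nnf_neg(c[2]))
    if op == "or":
        return ("and", nnf_neg(c[1]), nnf_neg(c[2]))
    if op == "all":
        return ("some", c[1], nnf_neg(c[2]))
    if op == "some":
        return ("all", c[1], nnf_neg(c[2]))
    if op == "atmost":                      # not (<= n R D) is (>= n+1 R D)
        return ("atleast", c[1] + 1, c[2], c[3])
    if op == "atleast" and c[1] >= 1:       # not (>= n R D) is (<= n-1 R D)
        return ("atmost", c[1] - 1, c[2], c[3])
    raise ValueError("unsupported concept: %r" % (c,))


def tr(c):
    """Translate an NNF concept as in Definition 6.22."""
    op = c[0]
    if op in ("atom", "not"):
        return c
    if op in ("and", "or"):
        return (op, tr(c[1]), tr(c[2]))
    if op in ("atleast", "atmost"):
        return (op, c[1], ("up", c[2]), tr(c[3]))
    if op == "all":                         # replaced by the fresh name X_{R,D}
        return ("atom", ("X", c[1], c[2]))
    if op == "some":                        # replaced by the negated name for the dual
        return ("not", ("X", c[1], nnf_neg(c[2])))
    raise ValueError("unsupported concept: %r" % (c,))


A = ("atom", "A")
```

Note that the translation does not descend below universal and existential restrictions; their meaning is supplied entirely by the TBox axioms for the fresh names.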

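Lemma 6.23

Let $C$ be a $\mathcal{SHIQ}$-concept in NNF, $\mathcal{R}$ a role hierarchy, and $\cdot^{tr}$ and $\mathcal{T}_C$ as defined in
Definition 6.22. $C$ is satisfiable w.r.t. $\mathcal{R}$ iff the $\mathcal{ALCQI}b$-concept $C^{tr}$ is satisfiable w.r.t. $\mathcal{T}_C$.

Proof. For the only-if-direction, let $C$ be a $\mathcal{SHIQ}$-concept and $\mathcal{R}$ a set of role axioms.
Assume that $\mathcal{I}$ is a ($\mathcal{SHIQ}$-)model of $C$ w.r.t. $\mathcal{R}$. Let $\mathcal{X} = \{X_{R,D} \mid \forall R.D \in \mathit{clos}(C, \mathcal{R})\}$
be the set of freshly introduced concept names from Definition 6.22.

We will construct an $\mathcal{ALCQI}b$-model $\mathcal{I}'$ for $C^{tr}$ and $\mathcal{T}_C$ from $\mathcal{I}$ by setting

\[ (X_{R,D})^{\mathcal{I}'} = (\forall R.D)^{\mathcal{I}} \]

for every $X_{R,D} \in \mathcal{X}$, and maintaining the interpretation of all other concept and role
names.

Claim 6.24 For all $D \in \mathit{clos}(C, \mathcal{R})$, $(D^{tr})^{\mathcal{I}'} = D^{\mathcal{I}}$.

This claim is proved by induction on the structure of concepts. For the base case
$D = A \in \mathsf{NC} \setminus \mathcal{X}$, $A^{tr} = A$ holds, and hence $(A^{tr})^{\mathcal{I}'} = A^{\mathcal{I}}$. For all other cases, except
for $D = \forall R.E$ and $D = \exists R.E$, the claim follows immediately by induction because $R^{\mathcal{I}} =
(R^\uparrow)^{\mathcal{I}} = (R^\uparrow)^{\mathcal{I}'}$ for every $R \in \overline{\mathsf{NR}}$, since $\mathcal{I} \models \mathcal{R}$ and because of Lemma 6.19.

For $D = \forall R.E$, $D^{tr} = X_{R,E}$ and by construction of $\mathcal{I}'$, $(X_{R,E})^{\mathcal{I}'} = (\forall R.E)^{\mathcal{I}}$. For
$D = \exists R.E$, $D^{tr} = \neg X_{R,{\sim}E}$ and

\[ (D^{tr})^{\mathcal{I}'} = \Delta^{\mathcal{I}'} \setminus (X_{R,{\sim}E})^{\mathcal{I}'} = \Delta^{\mathcal{I}} \setminus (\forall R.{\sim}E)^{\mathcal{I}} = (\exists R.E)^{\mathcal{I}}, \]

which finishes the proof of the claim.

In particular, since $C^{\mathcal{I}} \neq \emptyset$, also $(C^{tr})^{\mathcal{I}'} \neq \emptyset$. It remains to show that $\mathcal{I}' \models \mathcal{T}_C$ holds.
For the first set of axioms, this holds because

\[ (X_{R,D})^{\mathcal{I}'} = (\forall R.D)^{\mathcal{I}} = (\forall R^\uparrow.D^{tr})^{\mathcal{I}'}, \]

since $R^{\mathcal{I}} = (R^\uparrow)^{\mathcal{I}'}$ because of Lemma 6.19, and $D^{\mathcal{I}} = (D^{tr})^{\mathcal{I}'}$ due to Claim 6.24.
For the second set of axioms, let $T$ be a role with $T \sqsubseteq^* R$ and $\mathsf{Trans}(T)$. Then

\[ (X_{R,D})^{\mathcal{I}'} \subseteq (\forall T^\uparrow.X_{T,D})^{\mathcal{I}'}, \]

unless there is an $x \in (X_{R,D})^{\mathcal{I}'}$ and a $y \in \Delta^{\mathcal{I}'}$ with $(x, y) \in (T^\uparrow)^{\mathcal{I}'} = T^{\mathcal{I}}$ and $y \notin
(X_{T,D})^{\mathcal{I}'} = (\forall T.D)^{\mathcal{I}}$. This implies the existence of an element $z \in \Delta^{\mathcal{I}}$ with $(y, z) \in T^{\mathcal{I}}$
and $z \notin D^{\mathcal{I}}$. Since $(x, y) \in T^{\mathcal{I}}$ and $(y, z) \in T^{\mathcal{I}}$, transitivity of $T^{\mathcal{I}}$ implies $(x, z) \in T^{\mathcal{I}} \subseteq
R^{\mathcal{I}} = (R^\uparrow)^{\mathcal{I}'}$. Thus $(x, z) \in (R^\uparrow)^{\mathcal{I}'}$ and $z \notin (D^{tr})^{\mathcal{I}'}$ because $D^{\mathcal{I}} = (D^{tr})^{\mathcal{I}'}$. This implies
$x \notin (\forall R^\uparrow.D^{tr})^{\mathcal{I}'}$, which is a contradiction because $(\forall R^\uparrow.D^{tr})^{\mathcal{I}'} = (\forall R.D)^{\mathcal{I}} = (X_{R,D})^{\mathcal{I}'}$ and
$x \in (X_{R,D})^{\mathcal{I}'}$. Summing up, $(X_{R,D})^{\mathcal{I}'}$ is contained in the interpretation of every conjunct
that appears on the right-hand side of the axioms, and hence

\[ (X_{R,D})^{\mathcal{I}'} \subseteq \Big( \bigsqcap_{T \sqsubseteq^* R,\ \mathsf{Trans}(T)} \forall T^\uparrow.X_{T,D} \Big)^{\mathcal{I}'}. \]

This holds for every $X_{R,D} \in \mathcal{X}$ and hence $\mathcal{I}' \models \mathcal{T}_C$. Thus, we have shown that $C^{tr}$ is
satisfiable w.r.t. $\mathcal{T}_C$.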

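For the if-direction, let $\mathcal{I}$ be an interpretation with $(C^{tr})^{\mathcal{I}} \neq \emptyset$ and $\mathcal{I} \models \mathcal{T}_C$. From $\mathcal{I}$
we construct an interpretation $\mathcal{I}'$ such that $C^{\mathcal{I}'} \neq \emptyset$ and $\mathcal{I}' \models \mathcal{R}$. To achieve the latter, we
define $R^{\mathcal{I}'}$ as follows:

\[
R^{\mathcal{I}'} =
\begin{cases}
((R^\uparrow)^{\mathcal{I}})^+ & \text{if } \mathsf{Trans}(R), \\
(R^\uparrow)^{\mathcal{I}} \cup \bigcup_{S \sqsubseteq^* R,\ S \neq R} S^{\mathcal{I}'} & \text{otherwise.}
\end{cases}
\]

Since $\mathcal{R}$ is cycle-free, $R^{\mathcal{I}'}$ is well-defined for every $R$ and it is obvious that, for every $R$
with $\mathsf{Trans}(R)$, $R^{\mathcal{I}'}$ is transitive. First, we check that $\mathcal{I}'$ indeed satisfies $\mathcal{R}$.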
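
The case distinction in this definition translates directly into a recursive computation over a cycle-free hierarchy. The sketch below is an illustration under assumptions: `up_interp[r]` holds the relation that interprets the role conjunction of r, `proper_subs[r]` lists the sub-roles of r other than r itself, and the function names are hypothetical.

```python
def transitive_closure(rel):
    """Naive transitive closure of a finite binary relation."""
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def build_role(role, up_interp, proper_subs, transitive):
    """Relation for `role` in the constructed interpretation.

    Terminates because the hierarchy is assumed to be cycle-free.
    """
    if role in transitive:
        return transitive_closure(up_interp[role])
    result = set(up_interp[role])
    for sub in proper_subs.get(role, ()):
        result |= build_role(sub, up_interp, proper_subs, transitive)
    return result


# T is transitive and a sub-role of the non-transitive role R
up_interp = {"T": {(1, 2), (2, 3)}, "R": {(1, 2), (2, 3), (5, 6)}}
proper_subs = {"R": {"T"}}
t_rel = build_role("T", up_interp, proper_subs, {"T"})
r_rel = build_role("R", up_interp, proper_subs, {"T"})
```

The pair (1, 3) enters the relation for T through the transitive closure and is then inherited by R, which is exactly what makes the constructed interpretation respect T ⊑ R.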

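Claim 6.25 $\mathcal{I}' \models S \sqsubseteq R$ for every $S \sqsubseteq R \in \mathcal{R}$.

If $\neg\mathsf{Trans}(R)$, this is immediate from the construction. If $\mathsf{Trans}(R)$, then the proof is
more complicated. It is by induction on the number $\|S\| = \sharp\{S' \mid S' \sqsubseteq^* S, S' \neq S\}$, where
the case for $\mathsf{Trans}(S)$ does not make use of the induction hypothesis.

• If $\|S\| = 0$ and $\neg\mathsf{Trans}(S)$, then $S^{\mathcal{I}'} = (S^\uparrow)^{\mathcal{I}} \subseteq (R^\uparrow)^{\mathcal{I}} \subseteq ((R^\uparrow)^{\mathcal{I}})^+$ due to Lemma 6.19
because $S \sqsubseteq^* R$.

• If $\|S\| = n \ge 0$ and $\mathsf{Trans}(S)$, then $(S^\uparrow)^{\mathcal{I}} \subseteq (R^\uparrow)^{\mathcal{I}}$ since $S \sqsubseteq^* R$, and hence $S^{\mathcal{I}'} =
((S^\uparrow)^{\mathcal{I}})^+ \subseteq ((R^\uparrow)^{\mathcal{I}})^+ = R^{\mathcal{I}'}$.

• If $\|S\| = n > 0$ and $\neg\mathsf{Trans}(S)$, then $S^{\mathcal{I}'} = (S^\uparrow)^{\mathcal{I}} \cup \bigcup_{S' \sqsubseteq^* S,\ S' \neq S} S'^{\mathcal{I}'}$. For every $S' \sqsubseteq^* S$
with $S' \neq S$, $\|S'\| < \|S\|$ because $\mathcal{R}$ is cycle-free. Since $S' \sqsubseteq^* R$ holds by the definition
of $\sqsubseteq^*$, induction yields $(S')^{\mathcal{I}'} \subseteq R^{\mathcal{I}'}$. Also, since $S \sqsubseteq^* R$, $(S^\uparrow)^{\mathcal{I}} \subseteq (R^\uparrow)^{\mathcal{I}}$ and hence
$S^{\mathcal{I}'} \subseteq R^{\mathcal{I}'}$.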

Claim 6.26 If $(x, y) \in R^{\mathcal{I}'}$, then $(x, y) \in (R^\uparrow)^{\mathcal{I}}$ or there exists a role $T \sqsubseteq^* R$ with $\mathsf{Trans}(T)$
and a path $x_0, \dots, x_k$ such that $k > 1$, $x = x_0$, $y = x_k$, and $(x_i, x_{i+1}) \in (T^\uparrow)^{\mathcal{I}}$ for $0 \le i < k$.

Again, the proof is by induction on $\|\cdot\|$ and, if $\mathsf{Trans}(R)$ holds, then we do not need
to make use of the induction hypothesis.

• If $\|R\| = 0$ and $\neg\mathsf{Trans}(R)$, then $R^{\mathcal{I}'} = (R^\uparrow)^{\mathcal{I}}$ and thus $(x, y) \in R^{\mathcal{I}'}$ implies $(x, y) \in
(R^\uparrow)^{\mathcal{I}}$.

• If $\|R\| = n \ge 0$ and $\mathsf{Trans}(R)$, then $R^{\mathcal{I}'} = ((R^\uparrow)^{\mathcal{I}})^+$. If $(x, y) \in (R^\uparrow)^{\mathcal{I}}$, then we
are done. Otherwise, there exists a path $x_0, \dots, x_k$ with $k > 1$, $x = x_0$, $y = x_k$, and
$(x_i, x_{i+1}) \in (R^\uparrow)^{\mathcal{I}}$. Also, by definition, $R \sqsubseteq^* R$.

• If $\|R\| = n > 0$ and $\neg\mathsf{Trans}(R)$, then either $(x, y) \in (R^\uparrow)^{\mathcal{I}}$ or $(x, y) \in S^{\mathcal{I}'}$ for
some $S \sqsubseteq^* R$ with $S \neq R$. In the latter case, $\|S\| < \|R\|$ and, by induction, either
$(x, y) \in (S^\uparrow)^{\mathcal{I}} \subseteq (R^\uparrow)^{\mathcal{I}}$, or there exists a role $T \sqsubseteq^* S$ with $\mathsf{Trans}(T)$ and a path
$x_0, \dots, x_k$ with $k > 1$, $x = x_0$, $y = x_k$, and $(x_i, x_{i+1}) \in (T^\uparrow)^{\mathcal{I}}$ for $0 \le i < k$. Since
$T \sqsubseteq^* S \sqsubseteq^* R$, also $T \sqsubseteq^* R$.

Claim 6.27 For a simple role $R$, $R^{\mathcal{I}'} = (R^\uparrow)^{\mathcal{I}}$.

The proof is by induction on the number of sub-roles of $R$. If $R$ is simple and has no
sub-roles, then $R^{\mathcal{I}'} = (R^\uparrow)^{\mathcal{I}}$ holds by definition of $\mathcal{I}'$. If $R$ is simple, then every role $S$ with
$S \sqsubseteq^* R$ and $S \neq R$ has fewer sub-roles than $R$ because $\mathcal{R}$ is cycle-free, and must be simple
because otherwise $R$ would not be simple. Hence, the induction hypothesis is applicable
to each such $S$, which yields $S^{\mathcal{I}'} = (S^\uparrow)^{\mathcal{I}}$. Also, since $S \sqsubseteq^* R$, $S^{\mathcal{I}'} = (S^\uparrow)^{\mathcal{I}} \subseteq (R^\uparrow)^{\mathcal{I}}$ holds
by Lemma 6.19 and hence $R^{\mathcal{I}'} = (R^\uparrow)^{\mathcal{I}}$.

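Claim 6.28 $D^{\mathcal{I}'} = (D^{tr})^{\mathcal{I}}$ for every $D \in \mathit{clos}(C, \mathcal{R})$.

The proof is by induction on the value $[\cdot]$ of concepts in $\mathit{clos}(C, \mathcal{R})$, where the function
$[\cdot]$ is defined by

\[
[D] =
\begin{cases}
2 \times \|D\| + 1 & \text{if } D = \exists R.E \\
2 \times \|D\| & \text{otherwise}
\end{cases}
\]

where the definition of the norm $\|\cdot\|$ of a $\mathcal{SHIQ}$-concept is similar to the definition for
$\mathcal{ALCQ}$ extended to universal and existential restrictions:

\[
\begin{array}{rcl}
\|A\| = \|\neg A\| & = & 0 \quad \text{for } A \in \mathsf{NC} \\
\|C_1 \sqcap C_2\| = \|C_1 \sqcup C_2\| & = & 1 + \|C_1\| + \|C_2\| \\
\|\forall R.C\| = \|\exists R.C\| & = & 1 + \|C\| \\
\|(\bowtie n\, S\, C)\| & = & 1 + \|C\|
\end{array}
\]

The purpose of this (seemingly rather strange) definition of $[\cdot]$ is to reduce the case for
an existential restriction to its dual universal restriction. Except for existential, universal,
and number restrictions, all cases are straightforward.

• If $D = \forall R.E$, then $D^{tr} = X_{R,E}$. If $x \notin (X_{R,E})^{\mathcal{I}}$ then, since $\mathcal{I} \models X_{R,E} \doteq \forall R^\uparrow.E^{tr}$, also
$x \notin (\forall R^\uparrow.E^{tr})^{\mathcal{I}}$. By induction, $(E^{tr})^{\mathcal{I}} = E^{\mathcal{I}'}$ and $(R^\uparrow)^{\mathcal{I}} \subseteq R^{\mathcal{I}'}$, and hence $x \notin (\forall R.E)^{\mathcal{I}'}$.

If $x \in (X_{R,E})^{\mathcal{I}}$ and $(x, y) \in R^{\mathcal{I}'}$, then, by Claim 6.26, there are two possibilities.

  - If $(x, y) \in (R^\uparrow)^{\mathcal{I}}$, then $y \in (E^{tr})^{\mathcal{I}}$ holds because $\mathcal{I} \models X_{R,E} \doteq \forall R^\uparrow.E^{tr}$. By
  induction, $(E^{tr})^{\mathcal{I}} = E^{\mathcal{I}'}$, and hence $y \in E^{\mathcal{I}'}$.

  - There is a role $T \sqsubseteq^* R$ with $\mathsf{Trans}(T)$ and a path $x_0, \dots, x_k$ with $k > 1$, $x =
  x_0$, $y = x_k$, and $(x_i, x_{i+1}) \in (T^\uparrow)^{\mathcal{I}}$ for $0 \le i < k$. Since $\mathcal{I} \models X_{R,E} \sqsubseteq \forall T^\uparrow.X_{T,E}$
  and $\mathcal{I} \models X_{T,E} \sqsubseteq \forall T^\uparrow.X_{T,E}$, we have $x_i \in (X_{T,E})^{\mathcal{I}}$ for every $1 \le i < k$, and
  $x_{k-1} \in (X_{T,E})^{\mathcal{I}}$ in particular. Since $\mathcal{I} \models X_{T,E} \doteq \forall T^\uparrow.E^{tr}$, it follows that
  $y = x_k \in (E^{tr})^{\mathcal{I}}$ and, by induction, $y \in E^{\mathcal{I}'}$.

In any case, we have shown that $y \in E^{\mathcal{I}'}$ and, since $y$ has been chosen arbitrarily
with $(x, y) \in R^{\mathcal{I}'}$, $x \in (\forall R.E)^{\mathcal{I}'}$ holds.

• If $D = \exists R.E$, then $D^{tr} = \neg X_{R,{\sim}E}$ and

\[ D^{\mathcal{I}'} = \Delta^{\mathcal{I}'} \setminus (\forall R.{\sim}E)^{\mathcal{I}'} = \Delta^{\mathcal{I}} \setminus (X_{R,{\sim}E})^{\mathcal{I}} = (\neg X_{R,{\sim}E})^{\mathcal{I}}, \]

where $(\forall R.{\sim}E)^{\mathcal{I}'} = (X_{R,{\sim}E})^{\mathcal{I}}$ follows by induction since $[\forall R.{\sim}E] < [\exists R.E]$.

• If $D = (\bowtie n\, R\, E)$, then $D^{tr} = (\bowtie n\, R^\uparrow\, E^{tr})$ and $R$ is simple. By Claim 6.27,
$R^{\mathcal{I}'} = (R^\uparrow)^{\mathcal{I}}$ holds. Also, by induction, $(E^{tr})^{\mathcal{I}} = E^{\mathcal{I}'}$ and hence $D^{\mathcal{I}'} = (D^{tr})^{\mathcal{I}}$.

This finishes the proof of Claim 6.28, which yields $C^{\mathcal{I}'} = (C^{tr})^{\mathcal{I}} \neq \emptyset$. Since we have
already shown that $\mathcal{I}' \models \mathcal{R}$, we have proved satisfiability of $C$ w.r.t. $\mathcal{R}$. ■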



Since the reduction from Definition 6.22 is obviously polynomial in $|C|$ and $|\mathcal{R}|$, Lemma
6.23 together with Theorem 4.38 and Theorem 6.17 yield the following corollary.

Corollary 6.29

The following problems are ExpTime-complete even in the case of binary coding of
numbers in the input:

• Satisfiability and subsumption of $\mathcal{SHIQ}$-concepts w.r.t. role hierarchies.

• Satisfiability and subsumption of $\mathcal{SHIQ}$-concepts w.r.t. general TBoxes and role
hierarchies.

Obviously, the reduction from Definition 6.22 also works for $\mathcal{SHIQ}$-ABoxes and so, from
Theorem 4.42, we get that $\mathcal{SHIQ}$-knowledge bases can also be handled in ExpTime.

Corollary 6.30

Knowledge base satisfiability and instance checking for $\mathcal{SHIQ}$ are ExpTime-complete, even
in the case of binary coding of numbers in the input.

Finally, it is easy to see how to extend the reduction from Definition 6.22 to $\mathcal{SHIQO}$,
the extension of $\mathcal{SHIQ}$ with nominals. Simply set $i^{tr} = i$ for every individual $i \in \mathsf{NI}$. Since
$\mathcal{SHIQO}$ strictly contains $\mathcal{ALCQIO}$, we get the following.

Corollary 6.31

Concept satisfiability, satisfiability w.r.t. general TBoxes, and knowledge base satisfiability
for $\mathcal{SHIQO}$ are NExpTime-hard. The problems are NExpTime-complete if unary coding
of numbers in the input is assumed.

Proof. The lower bound is immediate from Corollary 5.27, since $\mathcal{ALCQIO}$ is strictly
contained in $\mathcal{SHIQO}$. For the upper bound, the reduction from Definition 6.22 extended to
$\mathcal{SHIQO}$ by setting $i^{tr} = i$ for every individual $i \in \mathsf{NI}$ yields a reduction from $\mathcal{SHIQO}$ to
$\mathcal{ALCQIO}$, for which the corresponding problems are solvable in NExpTime if unary coding
of numbers in the input is assumed. ■



6.3 Practical Reasoning for $\mathcal{SHIQ}$

The previous ExpTime-completeness results for $\mathcal{SHIQ}$ rely on the highly inefficient
automata construction of Definition 4.34 used to prove Theorem 4.38 and, in the case of
knowledge base reasoning, also on the wasteful pre-completion technique used to prove
Theorem 4.42. Thus, we cannot expect to obtain an implementation from these
algorithms that exhibits acceptable runtimes even on relatively "easy" instances. This, of
course, is a prerequisite for using $\mathcal{SHIQ}$ in real-world applications.

For less expressive DLs, some of the implementations of reasoners that perform fastest
in system comparisons (Massacci & Donini, 2000) are based on tableau calculi similar to the
ones we have already studied in this thesis. Among them are FaCT (Horrocks, 1998) for the
DL $\mathcal{SHF}$, RACE (Haarslev & Möller, 1999) for $\mathcal{SHN}$, and DLP (Patel-Schneider, 2000) for
an extension of $\mathcal{ALC}_{reg}$ with number restrictions. The efficiency of these implementations is
due to a number of optimizations (Baader, Franconi, Hollunder, Nebel, & Profitlich, 1994;
Horrocks, 1997; Horrocks & Patel-Schneider, 1999; Horrocks & Tobies, 2000; Haarslev &
Möller, 2000c) for which tableau algorithms proved to be amenable.

To make these optimizations applicable and to allow for an easy extension of existing
implementations to $\mathcal{SHIQ}$, we develop a tableau algorithm that decides concept
satisfiability for $\mathcal{SHIQ}$. By Theorem 6.17, such an algorithm can also be used to decide concept
satisfiability w.r.t. general TBoxes. This algorithm can be seen as the culmination point of
the development of tableau-based decision procedures for more and more expressive DLs.
To mention only the more recent ones: Sattler (1996a) describes an algorithm for $\mathcal{S}$ that
is subsequently extended to deal with role hierarchies ($\mathcal{SH}$) by Horrocks (1998). Haarslev
and Möller (2000a) add number restrictions ($\mathcal{SHN}$) while Horrocks and Sattler (1999) add
inverse roles and functional restrictions ($\mathcal{SHIF}$). Here, we extend the latter algorithm to
deal with qualifying number restrictions to obtain a tableau-based decision procedure for
$\mathcal{SHIQ}$.

Many techniques required for this extension are already present in the $\mathcal{SHIF}$-algorithm
(Horrocks & Sattler, 1999) and in the $\mathcal{ALCQ}$-algorithm presented in Chapter 4. In addition
to these techniques, we develop a novel way to construct a model from a completion tree
to prove soundness of the $\mathcal{SHIQ}$-algorithm. This is necessary because $\mathcal{SHIQ}$ no longer has
the finite model property.

6.3.1 A $\mathcal{SHIQ}$ Tableau

For the tableau algorithm, it will be helpful to have a syntactic criterion for
satisfiability that deals with the extra complexity caused by transitive roles, similar to the
tableau for $\mathcal{SI}$ defined in Definition 6.2. The $\mathcal{SHIQ}$-algorithm will then search for $\mathcal{SHIQ}$-tableaux
rather than for models. Like for $\mathcal{SI}$, elements of a $\mathcal{SHIQ}$-tableau are labelled
with sets of "relevant" concepts. Due to the presence of qualifying number restrictions,
not only the sub-concepts of the input concepts are of relevance but also their negations
(in NNF). Also, propagation of universal restrictions along transitive roles is slightly more
complicated in the presence of a role hierarchy compared to the case of $\mathcal{SI}$, and may involve
universal restrictions that are not present as sub-concepts of the input concept. Hence,
elements of the tableau are labelled not only with sub-concepts of the input concept but
rather from the larger set $\mathit{clos}(C, \mathcal{R})$ that is defined in Definition 6.20.

Based on this set, the definition of a tableau for $\mathcal{SHIQ}$ is now similar to the one for $\mathcal{SI}$
in Definition 6.2.

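Definition 6.32 (A Tableau for $\mathcal{SHIQ}$)

Let $C$ be a $\mathcal{SHIQ}$-concept in NNF, $\mathcal{R}$ a role hierarchy, and $\mathsf{NR}_{C,\mathcal{R}}$ the set of roles occurring
in $C$, $\mathcal{R}$, together with their inverses. A tableau $T$ for $C$ w.r.t. $\mathcal{R}$ is a triple $(\mathbf{S}, \mathcal{L}, \mathcal{E})$ such
that $\mathbf{S}$ is a non-empty set, $\mathcal{L} : \mathbf{S} \to 2^{\mathit{clos}(C, \mathcal{R})}$ maps each element to a subset of $\mathit{clos}(C, \mathcal{R})$,
$\mathcal{E} : \mathsf{NR}_{C,\mathcal{R}} \to 2^{\mathbf{S} \times \mathbf{S}}$ maps each role in $\mathsf{NR}_{C,\mathcal{R}}$ to a set of pairs of individuals, and the following
conditions are satisfied:

(T1) There is an $s \in \mathbf{S}$ with $C \in \mathcal{L}(s)$, and

for all $s, t \in \mathbf{S}$, $A, C_1, C_2, D \in \mathit{clos}(C, \mathcal{R})$, and $R, S \in \mathsf{NR}_{C,\mathcal{R}}$,

(T2) if $A \in \mathcal{L}(s)$, then $\neg A \notin \mathcal{L}(s)$ for $A \in \mathsf{NC}$,

(T3) if $C_1 \sqcap C_2 \in \mathcal{L}(s)$, then $C_1 \in \mathcal{L}(s)$ and $C_2 \in \mathcal{L}(s)$,

(T4) if $C_1 \sqcup C_2 \in \mathcal{L}(s)$, then $C_1 \in \mathcal{L}(s)$ or $C_2 \in \mathcal{L}(s)$,

(T5) if $\exists R.D \in \mathcal{L}(s)$, then there is some $t \in \mathbf{S}$ such that $(s, t) \in \mathcal{E}(R)$ and $D \in \mathcal{L}(t)$,

(T6) if $\forall R.D \in \mathcal{L}(s)$ and $(s, t) \in \mathcal{E}(R)$, then $D \in \mathcal{L}(t)$,

(T7) if $\forall R.D \in \mathcal{L}(s)$, $(s, t) \in \mathcal{E}(T)$ for some $T \sqsubseteq^* R$ with $\mathsf{Trans}(T)$, then $\forall T.D \in \mathcal{L}(t)$,

(T8) $(s, t) \in \mathcal{E}(R)$ iff $(t, s) \in \mathcal{E}(\mathsf{Inv}(R))$,

(T9) if $(s, t) \in \mathcal{E}(S)$ and $S \sqsubseteq^* R$, then $(s, t) \in \mathcal{E}(R)$,

(T10) if $(\ge n\, S\, D) \in \mathcal{L}(s)$, then $\sharp S^T(s, D) \ge n$,

(T11) if $(\le n\, S\, D) \in \mathcal{L}(s)$, then $\sharp S^T(s, D) \le n$,

(T12) if $(\bowtie n\, S\, D) \in \mathcal{L}(s)$ and $(s, t) \in \mathcal{E}(S)$, then $D \in \mathcal{L}(t)$ or ${\sim}D \in \mathcal{L}(t)$,

where we use $\bowtie$ as a placeholder for both $\le$ and $\ge$ and we define

\[ S^T(s, D) := \{t \in \mathbf{S} \mid (s, t) \in \mathcal{E}(S) \text{ and } D \in \mathcal{L}(t)\}. \]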
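
For a finite candidate tableau, the counting conditions (T10) and (T11) can be checked mechanically. The following sketch uses an assumed encoding: `labels` maps each element to its set of concepts, `edges` maps each role to a set of pairs, atomic concepts are strings and number restrictions are tuples such as ("atleast", 2, "r", "A"). It checks only (T10) and (T11); the remaining conditions, in particular (T12), are not covered.

```python
def successors_with(role, s, concept, labels, edges):
    """The set S^T(s, D): role-successors of s whose label contains the concept."""
    return {t for (u, t) in edges.get(role, set()) if u == s and concept in labels[t]}


def check_number_restrictions(labels, edges):
    """Check conditions (T10) and (T11) for every element of a finite structure."""
    for s, label in labels.items():
        for c in label:
            if not (isinstance(c, tuple) and c[0] in ("atleast", "atmost")):
                continue
            kind, n, role, d = c
            count = len(successors_with(role, s, d, labels, edges))
            if kind == "atleast" and count < n:
                return False
            if kind == "atmost" and count > n:
                return False
    return True


edges = {"r": {("s", "t1"), ("s", "t2"), ("s", "t3")}}
good = {"s": {("atleast", 2, "r", "A")}, "t1": {"A"}, "t2": {"A"}, "t3": set()}
bad = {"s": {("atmost", 1, "r", "A")}, "t1": {"A"}, "t2": {"A"}, "t3": set()}
```

The structure `good` passes because s has two r-successors labelled with A, while `bad` fails because the same two successors exceed the bound of the at-most restriction.
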
The existence of a tableau is a necessary and sufficient criterion for satisfiability: 
Lemma 6.33 

A SHLQ-concept C is satishable with respect to a role hierarchy TZ iff there exists a tableau 
for C with respect to TZ. 

Proof. For the if-direction, the construction of a model of $C$ from a tableau for $C$ is
similar to the one presented in the proof of Lemma 6.3, where the interpretation of the
roles is defined in the same manner as in the proof of Lemma 6.23. To be more
precise, if $T = (\mathbf{S}, \mathcal{L}, \mathcal{E})$ is a tableau for $C$ w.r.t. $\mathcal{R}$ and $C \in \mathcal{L}(s_0)$, a model $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$
of $C$ can be defined as follows:

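$\Delta^{\mathcal{I}} = \mathbf{S}$

$A^{\mathcal{I}} = \{s \mid A \in \mathcal{L}(s)\}$ for all concept names $A$ in $\mathit{clos}(C,\mathcal{R})$

$R^{\mathcal{I}} = \begin{cases} \mathcal{E}(R)^{+} & \text{if } \mathsf{Trans}(R) \\ \mathcal{E}(R) \cup \bigcup_{S \sqsubseteq^* R,\ S \neq R} S^{\mathcal{I}} & \text{otherwise} \end{cases}$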

As in the proof of Lemma 6.23, it is easy to see that $\mathcal{I} \models \mathcal{R}$ and that $(s,t) \in R^{\mathcal{I}}$ iff
$(s,t) \in \mathcal{E}(R)$ or there exists a role $T \sqsubseteq^* R$ with $\mathsf{Trans}(T)$ and a path $s_0, \dots, s_k$ such that
$k > 1$, $s = s_0$, $t = s_k$, and $(s_i, s_{i+1}) \in \mathcal{E}(T)$ for $0 \leq i < k$. Moreover, if $R$ is simple, then
$R^{\mathcal{I}} = \mathcal{E}(R)$.

It remains to show that $C^{\mathcal{I}} \neq \emptyset$. This is done by proving that $D \in \mathcal{L}(s)$ implies $s \in D^{\mathcal{I}}$
for each $D \in \mathit{clos}(C,\mathcal{R})$ and $s \in \mathbf{S}$. Since $C \in \mathcal{L}(s_0)$, we then have $s_0 \in C^{\mathcal{I}}$ and hence
$\mathcal{I}$ is a model of $C$. The proof is by induction on the norm $\|\cdot\|$ of concepts as defined in
the proof of Lemma 6.23. The two base cases of the induction are $D = A$ or $D = \neg A$ for
$A \in \mathsf{NC}$. If $A \in \mathcal{L}(s)$, then, by definition, $s \in A^{\mathcal{I}}$. If $\neg A \in \mathcal{L}(s)$, then, by (T2), $A \notin \mathcal{L}(s)$
and hence $s \notin A^{\mathcal{I}}$. For the induction step, we have to distinguish several cases:

• The cases $D = C_1 \sqcap C_2$, $D = C_1 \sqcup C_2$, and $D = \exists R.E$ are exactly as for $\mathcal{SI}$ in the
proof of Lemma 6.3.

• $D = \forall R.E$. Let $s \in \mathbf{S}$ with $D \in \mathcal{L}(s)$, and let $t \in \mathbf{S}$ be an arbitrary individual such that
$(s,t) \in R^{\mathcal{I}}$. There are two possibilities:

  - $(s,t) \in \mathcal{E}(R)$. Then (T6) implies $E \in \mathcal{L}(t)$ and, by induction, $t \in E^{\mathcal{I}}$.

  - $(s,t) \notin \mathcal{E}(R)$. Due to (T8), this can only be the case if there is a role $T \sqsubseteq^* R$ with
    $\mathsf{Trans}(T)$ and a path $(s, s_1), (s_1, s_2), \dots, (s_{k-1}, t) \in \mathcal{E}(T)$ with $k > 1$. Then (T7)
    implies $\forall T.E \in \mathcal{L}(s_i)$ for all $1 \leq i \leq k-1$ and particularly $\forall T.E \in \mathcal{L}(s_{k-1})$.
    Due to (T6), $E \in \mathcal{L}(t)$ also holds. Again, by induction, this implies $t \in E^{\mathcal{I}}$.

In both cases, we have $t \in E^{\mathcal{I}}$ and, since $t$ has been chosen arbitrarily, $s \in D^{\mathcal{I}}$ holds.

• $D = (\geq n\, S\, E)$. For an $s$ with $D \in \mathcal{L}(s)$, we have $\sharp S^T(s,E) \geq n$ by (T10). Hence
there are $n$ individuals $t_1, \dots, t_n$ such that $t_i \neq t_j$ for $i \neq j$, $(s, t_i) \in \mathcal{E}(S)$, and
$E \in \mathcal{L}(t_i)$ for all $i$. By induction, we have $t_i \in E^{\mathcal{I}}$ and, since $\mathcal{E}(S) \subseteq S^{\mathcal{I}}$, also $s \in D^{\mathcal{I}}$.

• $D = (\leq n\, S\, E)$. For this case, it is crucial that $S$ is a simple role because this
implies $S^{\mathcal{I}} = \mathcal{E}(S)$. Let $s$ be an individual with $D \in \mathcal{L}(s)$. Due to (T12), we have
$E \in \mathcal{L}(t)$ or $\mathord{\sim}E \in \mathcal{L}(t)$ for each $t$ with $(s,t) \in \mathcal{E}(S)$. Moreover, $\sharp S^T(s,E) \leq n$ holds
due to (T11). We show that $\sharp S^{\mathcal{I}}(s,E) \leq \sharp S^T(s,E)$: assume $\sharp S^{\mathcal{I}}(s,E) > \sharp S^T(s,E)$.
This implies the existence of some $t$ with $(s,t) \in S^{\mathcal{I}}$ and $t \in E^{\mathcal{I}}$ but $E \notin \mathcal{L}(t)$
(because $S^{\mathcal{I}} = \mathcal{E}(S)$). By (T12), this implies $\mathord{\sim}E \in \mathcal{L}(t)$, which, by induction, yields
$t \in (\mathord{\sim}E)^{\mathcal{I}}$, in contradiction to $t \in E^{\mathcal{I}}$.

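For the only-if-direction, we have to show that satisfiability of $C$ w.r.t. $\mathcal{R}$ implies the
existence of a tableau $T$ for $C$ w.r.t. $\mathcal{R}$.

Let $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ be a model of $C$ with $\mathcal{I} \models \mathcal{R}$. A tableau $T = (\mathbf{S}, \mathcal{L}, \mathcal{E})$ for $C$ can be
defined by:

$\mathbf{S} = \Delta^{\mathcal{I}}$,
$\mathcal{E}(R) = R^{\mathcal{I}}$,
$\mathcal{L}(s) = \{D \in \mathit{clos}(C,\mathcal{R}) \mid s \in D^{\mathcal{I}}\}$.

It remains to demonstrate that $T$ is a tableau for $C$: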

• Except for (T7) and (T9), all conditions are satisfied as a direct consequence of the
definition of the semantics of $\mathcal{SHIQ}$-concepts.

• For (T7), if $s \in (\forall R.D)^{\mathcal{I}}$ and $(s,t) \in T^{\mathcal{I}}$ for $T$ with $\mathsf{Trans}(T)$ and $T \sqsubseteq^* R$, then
$t \in (\forall T.D)^{\mathcal{I}}$ unless there is some $u$ such that $(t,u) \in T^{\mathcal{I}}$ and $u \notin D^{\mathcal{I}}$. In this
case, since $(s,t) \in T^{\mathcal{I}}$, $(t,u) \in T^{\mathcal{I}}$, and $\mathsf{Trans}(T)$, it holds that $(s,u) \in T^{\mathcal{I}}$. Hence
$(s,u) \in R^{\mathcal{I}}$ and $s \notin (\forall R.D)^{\mathcal{I}}$, in contradiction to the assumption. $T$ therefore
satisfies (T7).

• Condition (T9) is satisfied because $\mathcal{I} \models \mathcal{R}$ and set-inclusion is a transitive property.



6.3.2 A Tableau Algorithm for $\mathcal{SHIQ}$

In the following, we present an algorithm that, given a $\mathcal{SHIQ}$-concept $C$ and a role hierarchy
$\mathcal{R}$, decides the existence of a tableau for $C$ w.r.t. $\mathcal{R}$. As before, we assume that $\mathcal{R}$ is cycle-free.
Like the $\mathcal{SI}$-algorithm, the $\mathcal{SHIQ}$-algorithm works on a finite completion tree, and
employs a blocking technique to guarantee termination.
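Figure 6.5 A tableau where pair-wise blocking is crucial

x:  $\mathbf{L}(x) = \{\neg A,\ (\leq 1\, F),\ \exists F^{-1}.D,\ \forall R^{-1}.(\exists F^{-1}.D)\}$
 |  edge labelled $F^{-1}$
y:  $\mathbf{L}(y) = \{D,\ \exists F^{-1}.D,\ \forall R^{-1}.(\exists F^{-1}.D),\ A,\ (\leq 1\, F),\ \exists F.\neg A\}$
 |  edge labelled $F^{-1}$ (y blocks z)
z:  $\mathbf{L}(z) = \{D,\ \exists F^{-1}.D,\ \forall R^{-1}.(\exists F^{-1}.D),\ A,\ (\leq 1\, F),\ \exists F.\neg A\}$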



Pair-wise Blocking 

From the fact that $\mathcal{SHIQ}$ no longer has the finite model property, it is immediately clear
that the tableau construction we have employed for $\mathcal{SI}$ will not work without modification
for $\mathcal{SHIQ}$, as this technique always resulted in finite tableaux and hence in finite models.

Horrocks and Sattler (1999) show that for the fragment $\mathcal{SHIF}$ of $\mathcal{SHIQ}$, dynamic blocking
is no longer sufficient and describe the pair-wise blocking technique that can successfully
be applied also for $\mathcal{SHIQ}$: if a path contains two pairs of successive nodes that have pair-wise
identical labels and whose connecting edges have identical labels, then the path beyond
the second pair is no longer expanded; it is blocked. Blocked paths are then "unraveled"
to construct an infinite tableau from a finite completion tree. The identical labels make
sure that copies of the blocking node and its descendants can be substituted for the blocked
node and its respective descendants. Note the similarity between this pair-wise blocking
condition and the condition imposed by the combination of the blocking condition and the
cut rule by De Giacomo and Massacci (2001) for CPDL.

Figure 6.5 shows that pair-wise blocking is crucial in order to ensure that the algorithm
discovers the unsatisfiability of the concept

$\neg A \sqcap (\leq 1\, F) \sqcap \exists F^{-1}.D \sqcap \forall R^{-1}.(\exists F^{-1}.D)$,

where $\mathsf{Trans}(R)$, $F \sqsubseteq R$, and $D$ represents the concept

$A \sqcap (\leq 1\, F) \sqcap \exists F.\neg A$.

Using dynamic blocking, $z$ would be blocked by $y$. The resulting tree cannot represent
a cyclic model in which $y$ is related to itself by an $F^{-1}$ role as this would conflict with
$(\leq 1\, F) \in \mathbf{L}(y)$. The tree must therefore represent the infinite model generated by recursively
replacing each occurrence of $z$ with a copy of the tree rooted at $y$. However, this also
does not lead to a valid model, since, if $z$ is substituted by a copy of $y$, then the constraint
$\exists F.\neg A \in \mathbf{L}(y)$, which was satisfied because of $\neg A \in \mathbf{L}(x)$, is no longer satisfied in its new
location.

When pair-wise blocking is used, $z$ is no longer blocked by $y$ as the labels of their
predecessors ($y$ and $x$ respectively) are not equal, and the algorithm continues to expand
$\mathbf{L}(z)$. The expansion of $\exists F.\neg A \in \mathbf{L}(z)$ calls for the existence of a node whose label includes
$\neg A$ and that is connected to $z$ by an $F$-labelled edge. Because of $(\leq 1\, F) \in \mathbf{L}(z)$, this node
must be $y$, and this results in a contradiction as both $A$ and $\neg A$ will be in $\mathbf{L}(y)$.

To extend the $\mathcal{SHIF}$-algorithm from (Horrocks & Sattler, 1999) to $\mathcal{SHIQ}$, we add rules
that deal with qualifying number restrictions, similar to the ones used by the standard
algorithm for $\mathcal{ALCQ}$ (Algorithm 4.4), namely a $\to_{\geq}$-rule that introduces new successor nodes
to satisfy $\geq$-restrictions, a $\to_{\leq}$-rule that identifies nodes as required by $\leq$-restrictions, and
a $\to_{\mathit{choose}}$-rule that makes sure that all relevant concepts at a node are either positively or
negatively asserted for that node (refer to (T12)).

In order to guarantee the termination of the algorithm, we have to make sure that the
$\to_{\geq}$- and $\to_{\leq}$-rules cannot be applied in a way that would yield an infinite sequence of
rule applications, generating and identifying successors indefinitely. This is enforced by
recording in a relation "$\neq$" which nodes have been introduced by an application of the
$\to_{\geq}$-rule and by prohibiting the identification of these nodes by the $\to_{\leq}$-rule.

Algorithm 6.34 (The $\mathcal{SHIQ}$-algorithm)

Let $C$ be a $\mathcal{SHIQ}$-concept in NNF to be tested for satisfiability w.r.t. a role hierarchy
$\mathcal{R}$ and $\mathsf{NR}_{C,\mathcal{R}}$ the set of roles that occur in $C$ and $\mathcal{R}$ together with their inverses. A
completion tree $\mathbf{T} = (\mathbf{V}, \mathbf{E}, \mathbf{L})$ is a labelled tree in which each node $x \in \mathbf{V}$ is labelled with
a set $\mathbf{L}(x) \subseteq \mathit{clos}(C,\mathcal{R})$ and each edge $(x,y) \in \mathbf{E}$ is labelled with a set $\mathbf{L}(x,y) \subseteq \mathsf{NR}_{C,\mathcal{R}}$.
The algorithm expands the tree by extending $\mathbf{L}(x)$ for some node $x$, or by adding new
leaf nodes. Additionally, we keep track of inequalities between nodes of the tree with a
symmetric binary relation $\neq$ between nodes in $\mathbf{V}$.

Given a completion tree, a node $y$ is called an $R$-successor of a node $x$ if $y$ is a successor
of $x$ and $S \in \mathbf{L}(x,y)$ for some $S$ with $S \sqsubseteq^* R$; $y$ is called an $R$-neighbour of $x$ if it is an
$R$-successor of $x$, or if $x$ is an $\mathsf{Inv}(R)$-successor of $y$.

For a role $R$, a concept $D$, and a node $x \in \mathbf{V}$, we define $R^{\mathbf{T}}(x, D)$ by

$R^{\mathbf{T}}(x, D) := \{y \mid y \text{ is an } R\text{-neighbour of } x \text{ and } D \in \mathbf{L}(y)\}$.
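The notions of $R$-successor and $R$-neighbour can be made concrete with a small sketch. Everything below is illustrative: inverse roles are encoded by a trailing `-`, the hierarchy closure is assumed to be closed under inverses, and the helper names (`inv`, `sub_star`, `below`, `r_neighbours`) are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def inv(role: str) -> str:
    """Inverse of a role; a trailing '-' marks an inverse (encoding assumed here)."""
    return role[:-1] if role.endswith("-") else role + "-"


def sub_star(hierarchy: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Transitive closure of the hierarchy, closed under inverses (an assumption)."""
    pairs = set(hierarchy)
    pairs |= {(inv(a), inv(b)) for a, b in pairs}
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for c, d in list(pairs):
                if b == c and (a, d) not in pairs:
                    pairs.add((a, d))
                    changed = True
    return pairs


def below(s: str, r: str, closure: Set[Tuple[str, str]]) -> bool:
    """S is below R in the reflexive-transitive closure."""
    return s == r or (s, r) in closure


def r_neighbours(x: str, role: str,
                 edge_labels: Dict[Tuple[str, str], Set[str]],
                 closure: Set[Tuple[str, str]]) -> Set[str]:
    """All R-neighbours of x; edge_labels maps (parent, child) to a set of roles."""
    result = set()
    for (parent, child), label in edge_labels.items():
        if parent == x and any(below(s, role, closure) for s in label):
            result.add(child)   # child is an R-successor of x
        if child == x and any(below(s, inv(role), closure) for s in label):
            result.add(parent)  # x is an Inv(R)-successor of parent
    return result


closure = sub_star([("F", "R")])
tree_edges = {("x", "y"): {"F-"}}
print(r_neighbours("y", "R", tree_edges, closure))
```

In this example `y` is an `F-`-successor of `x`, so `x` is both an `F`- and an `R`-neighbour of `y`, while `x` has no `R`-neighbour at all.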

A node x is directly blocked if none of its ancestors is blocked, and it has ancestors x' , 
y, and y' , such that 

1. x is a successor of x' and y is a successor of y' , and 

2. L(x) = L(y) and L(x') = L(y') and 

3. L(x',x) = L(y',y). 

In this case we will say that y blocks x. 

A node is indirectly blocked if its predecessor is directly or indirectly blocked, and in
order to avoid wasted expansion after an application of the $\to_{\leq}$-rule, a node $y$ will also be
taken to be indirectly blocked if $y$ is a successor of a node $x$ and $\mathbf{L}(x,y) = \emptyset$.
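The blocking conditions above can be sketched as a top-down pass over the tree. This is a hedged illustration only: the dictionary-based tree encoding and the names `blocker` and `blocking_status` are assumptions, and nodes must be supplied parents-first so that the status of every ancestor is known when a node is examined.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Label = FrozenSet[str]


def blocker(x: str, parent: Dict[str, str], labels: Dict[str, Label],
            edge_labels: Dict[Tuple[str, str], Label]) -> Optional[str]:
    """Return an ancestor y matching x under the pair-wise label conditions, or None."""
    xp = parent.get(x)
    if xp is None:
        return None  # the root has no predecessor and is never blocked
    y = xp
    while y is not None:
        yp = parent.get(y)
        if (yp is not None and labels[x] == labels[y]
                and labels[xp] == labels[yp]
                and edge_labels[(xp, x)] == edge_labels[(yp, y)]):
            return y
        y = yp
    return None


def blocking_status(order: List[str], parent: Dict[str, str],
                    labels: Dict[str, Label],
                    edge_labels: Dict[Tuple[str, str], Label]) -> Dict[str, str]:
    """Classify each node; `order` must list every parent before its children."""
    status: Dict[str, str] = {}
    for x in order:
        p = parent.get(x)
        if p is not None and (status[p] != "unblocked" or not edge_labels[(p, x)]):
            status[x] = "indirectly blocked"
        elif blocker(x, parent, labels, edge_labels) is not None:
            status[x] = "directly blocked"
        else:
            status[x] = "unblocked"
    return status


# Hypothetical chain a -> b -> c -> d whose labels repeat with period two.
parent = {"b": "a", "c": "b", "d": "c"}
P, Q, R = frozenset({"P"}), frozenset({"Q"}), frozenset({"R"})
labels = {"a": P, "b": Q, "c": P, "d": Q}
edge_labels = {("a", "b"): R, ("b", "c"): R, ("c", "d"): R}
print(blocking_status(["a", "b", "c", "d"], parent, labels, edge_labels))
```

Node `d` is directly blocked by `b` because the pairs `(a, b)` and `(c, d)` carry identical node and edge labels. A node whose label merely equals that of its predecessor, as `z` and `y` in Figure 6.5, is not blocked.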

For a node $x$, $\mathbf{L}(x)$ contains a clash if, for some concept name $A \in \mathsf{NC}$, $\{A, \neg A\} \subseteq \mathbf{L}(x)$,
or if, for some concept $D$, some role $S$, and some $n \in \mathbb{N}$: $(\leq n\, S\, D) \in \mathbf{L}(x)$ and there
are $n + 1$ nodes $y_0, \dots, y_n$ such that $D \in \mathbf{L}(y_i)$, $y_i$ is an $S$-neighbour of $x$, and $y_i \neq y_j$ for
all $0 \leq i < j \leq n$.
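The clash condition can be sketched as follows. The encoding is an assumption of this illustration: negated concept names are tuples `("not", A)`, at-most restrictions are tuples `("<=", n, S, D)`, the relation $\neq$ is a set of two-element frozensets, and `neighbours` is a precomputed map from `(S, D)` to the $S$-neighbours of $x$ whose label contains $D$.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, Set, Tuple


def has_clash(label: Set[Hashable],
              neighbours: Dict[Tuple[str, Hashable], Set[str]],
              neq: Set[FrozenSet[str]]) -> bool:
    """True if the label of a node contains a clash in the sense described above."""
    for c in label:
        if not isinstance(c, tuple):
            continue
        # First kind of clash: A and its negation occur together.
        if c[0] == "not" and c[1] in label:
            return True
        # Second kind: n + 1 qualifying neighbours that are pairwise unequal.
        if c[0] == "<=":
            _, n, role, d = c
            candidates = sorted(neighbours.get((role, d), set()))
            for group in combinations(candidates, n + 1):
                if all(frozenset(pair) in neq for pair in combinations(group, 2)):
                    return True
    return False


print(has_clash({"A", ("not", "A")}, {}, set()))
```

The search over `combinations` is exponential in the worst case; it is meant to mirror the definition, not to be efficient.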

The algorithm initializes the tree $\mathbf{T}$ to contain a single node $x_0$, called the root node,
with $\mathbf{L}(x_0) = \{C\}$. The inequality relation $\neq$ is initialized with the empty set. $\mathbf{T}$ is then
expanded by repeatedly applying the rules from Figure 6.6.

The completion tree is complete if, for some node $x$, $\mathbf{L}(x)$ contains a clash, or if none
of the rules is applicable. If, for an input concept $C$, the expansion rules can be applied
in such a way that they yield a complete, clash-free completion tree, then the algorithm
returns "$C$ is satisfiable", and "$C$ is unsatisfiable" otherwise.

For a discussion of the different kinds of non-determinism present in the $\mathcal{SHIQ}$-algorithm,
see the discussion following Algorithm 6.4.

As for $\mathcal{SI}$, the definition of successor and predecessor reflects the relative position
of two nodes in the completion tree: if $x$ is an $R$-successor of $y$ then this implies that
$(y, x) \in \mathbf{E}$ and it is not the case that $y$ is an $\mathsf{Inv}(R)$-successor of $x$. It is necessary to make
this pedantic distinction because when we construct a tableau from a complete and clash-free
completion tree in the proof of Lemma 6.38, a blocked successor is replaced by a copy
of the sub-completion tree consisting of the respective blocking node and its descendants.
This makes the distinction between $R$-successors and $\mathsf{Inv}(R)$-predecessors significant, and it
has to be reflected in the completion rules.

Note that the definition of blocking is recursive because the status of a node depends,
among other things, on the status of its predecessor. Since the dependency is on the
predecessor and the ancestors, one can determine the status of every node starting at the
root, which has no predecessor or ancestor and hence is never blocked. Once the blocking
status of a node has been determined, one can then determine the status of its successors.

Since we only block along a path in the completion tree, for every directly blocked node
there is a uniquely determined blocking node. Assume there were a directly blocked
node $x$ and two distinct unblocked nodes $y_1, y_2$ blocking $x$. Since both $y_1$ and $y_2$ must be
ancestors of $x$, w.l.o.g., $y_1$ is an ancestor of $y_2$. Yet, this implies that $y_1$ directly blocks $y_2$
and $x$ cannot be directly blocked because it has the blocked ancestor $y_2$, a contradiction.

Before we prove the correctness of the $\mathcal{SHIQ}$-algorithm, we discuss the intuition behind
the expansion rules and their correspondence to the constructors of $\mathcal{SHIQ}$. Roughly
speaking,⁴ the completion tree is a partial description of a model whose individuals correspond
to nodes and whose interpretation of concept and role names is determined by the
node and edge labels. Since the completion tree is a tree, this would not yield a correct
interpretation of transitive roles, and thus the interpretation of transitive roles is built via
the transitive closure of the relations induced by the corresponding edge labels.

The $\to_{\sqcap}$-, $\to_{\sqcup}$-, $\to_{\exists}$- and $\to_{\forall}$-rules are the standard tableau rules for $\mathcal{ALC}$ from Algorithm 3.2,
with the exception that we limit the applicability of the $\to_{\forall}$- and $\to_{\forall_+}$-rule
to those nodes that are not blocked or directly blocked. The $\to_{\forall_+}$-rule is similar to the



4 For the following considerations, we employ a simpler view of the correspondence between completion 
trees and models, and do not bother with the unraveling construction mentioned above. 
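Figure 6.6 Tableau expansion rules for $\mathcal{SHIQ}$

$\to_{\sqcap}$:  if   1. $C_1 \sqcap C_2 \in \mathbf{L}(x)$ and
               2. $\{C_1, C_2\} \not\subseteq \mathbf{L}(x)$
          then $\mathbf{L}(x) \to_{\sqcap} \mathbf{L}(x) \cup \{C_1, C_2\}$

$\to_{\sqcup}$:  if   1. $C_1 \sqcup C_2 \in \mathbf{L}(x)$ and
               2. $\{C_1, C_2\} \cap \mathbf{L}(x) = \emptyset$
          then $\mathbf{L}(x) \to_{\sqcup} \mathbf{L}(x) \cup \{E\}$ for some $E \in \{C_1, C_2\}$

$\to_{\forall}$:  if   1. $\forall R.D \in \mathbf{L}(x)$, $x$ is not indirectly blocked, and
               2. there is an $R$-neighbour $y$ of $x$ with $D \notin \mathbf{L}(y)$
          then $\mathbf{L}(y) \to_{\forall} \mathbf{L}(y) \cup \{D\}$

$\to_{\forall_+}$: if   1. $\forall R.D \in \mathbf{L}(x)$, $x$ is not indirectly blocked, and
               2. there is some $T$ with $\mathsf{Trans}(T)$ and $T \sqsubseteq^* R$,
               3. there is a $T$-neighbour $y$ of $x$ with $\forall T.D \notin \mathbf{L}(y)$
          then $\mathbf{L}(y) \to_{\forall_+} \mathbf{L}(y) \cup \{\forall T.D\}$

$\to_{\exists}$:  if   1. $\exists R.D \in \mathbf{L}(x)$, $x$ is not blocked, and
               2. $x$ has no $R$-neighbour $y$ with $D \in \mathbf{L}(y)$
          then create a new successor $y$ of $x$ with $\mathbf{L}(x, y) = \{R\}$ and $\mathbf{L}(y) = \{D\}$

$\to_{\mathit{choose}}$: if 1. $(\bowtie n\, S\, D) \in \mathbf{L}(x)$, $x$ is not indirectly blocked, and
               2. there is an $S$-neighbour $y$ of $x$ with $\{D, \mathord{\sim}D\} \cap \mathbf{L}(y) = \emptyset$
          then $\mathbf{L}(y) \to_{\mathit{choose}} \mathbf{L}(y) \cup \{E\}$ for some $E \in \{D, \mathord{\sim}D\}$

$\to_{\geq}$:  if   1. $(\geq n\, S\, D) \in \mathbf{L}(x)$, $x$ is not blocked, and
               2. there are not $n$ $S$-neighbours $y_1, \dots, y_n$ of $x$ with
                  $D \in \mathbf{L}(y_i)$ and $y_i \neq y_j$ for $1 \leq i < j \leq n$
          then create $n$ new successors $y_1, \dots, y_n$ of $x$ with $\mathbf{L}(x, y_i) = \{S\}$,
               $\mathbf{L}(y_i) = \{D\}$, and $y_i \neq y_j$ for $1 \leq i < j \leq n$.

$\to_{\leq}$:  if   1. $(\leq n\, S\, D) \in \mathbf{L}(x)$ with $n \geq 1$, $x$ is not indirectly blocked, and
               2. $\sharp S^{\mathbf{T}}(x, D) > n$ and there are two $S$-neighbours $y, z$ of $x$ with
                  $D \in \mathbf{L}(y)$, $D \in \mathbf{L}(z)$, $y$ is a successor of $x$, and not $y \neq z$
          then 1. $\mathbf{L}(z) \to_{\leq} \mathbf{L}(z) \cup \mathbf{L}(y)$ and
               2. if $z$ is a predecessor of $x$
                  then $\mathbf{L}(z, x) \to_{\leq} \mathbf{L}(z, x) \cup \mathsf{Inv}(\mathbf{L}(x, y))$
                  else $\mathbf{L}(x, z) \to_{\leq} \mathbf{L}(x, z) \cup \mathbf{L}(x, y)$
               3. $\mathbf{L}(x, y) \to_{\leq} \emptyset$
               4. Set $u \neq z$ for all $u$ with $u \neq y$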



$\to_{\forall_+}$-rule for $\mathcal{SI}$ without refined blocking, extended to deal with role hierarchies as follows.
Assume a situation that satisfies the precondition of the $\to_{\forall_+}$-rule, i.e., $\forall R.D \in \mathbf{L}(x)$,
and there is a $T$-neighbour $y$ of $x$ with $\mathsf{Trans}(T)$, $T \sqsubseteq^* R$, and $\forall T.D \notin \mathbf{L}(y)$. If $y$ has a
$T$-neighbour $z$, then, due to the transitivity of $T$, $x$ and $z$ are also related via $T$. Since
$T \sqsubseteq^* R$, $z$ is also an $R$-neighbour of $x$ and hence must satisfy $D$. This is ensured by adding
$\forall T.D$ to $\mathbf{L}(y)$, which, in turn, causes $D$ to be added to $\mathbf{L}(z)$.

The rules dealing with qualifying number restrictions work similarly to the rules of
the standard algorithm for $\mathcal{ALCQ}$ (Algorithm 4.4). For a concept $(\geq n\, S\, D) \in \mathbf{L}(x)$, the
$\to_{\geq}$-rule generates $n$ $S$-successors $y_1, \dots, y_n$ of $x$ with $D \in \mathbf{L}(y_i)$. To prevent the $\to_{\leq}$-rule
from identifying these new nodes, it also sets $y_i \neq y_j$ for each $1 \leq i < j \leq n$. Conversely,
if $(\leq n\, S\, D) \in \mathbf{L}(x)$ and $x$ has more than $n$ $S$-neighbours that are labelled with $D$, then
the $\to_{\leq}$-rule chooses two of them, say $y, z$, that are not required to be distinct by $\neq$ and
merges them, together with the edges connecting them with $x$. The algorithm constructs
a completion tree, so at least one of $y, z$ must be a successor of $x$. Let this be $y$. If $z$ is a
predecessor of $x$ in the completion tree, then it is necessary that we join $y$ onto $z$ and not
vice versa because otherwise $x$ would become detached in the completion tree.
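The merge performed by the $\to_{\leq}$-rule can be sketched on a dictionary-based tree. This is a minimal illustration under assumed encodings: inverse roles carry a trailing `-`, the relation $\neq$ is a set of two-element frozensets, and the function name `merge` is hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple


def inv(role: str) -> str:
    """Inverse of a role; a trailing '-' marks an inverse (encoding assumed here)."""
    return role[:-1] if role.endswith("-") else role + "-"


def merge(x: str, y: str, z: str,
          labels: Dict[str, Set[str]],
          edge_labels: Dict[Tuple[str, str], Set[str]],
          neq: Set[FrozenSet[str]],
          parent: Dict[str, str]) -> None:
    """Merge the successor y of x into the S-neighbour z of x, in place."""
    labels[z] |= labels[y]                       # step 1: z inherits the label of y
    if parent.get(x) == z:                       # step 2: z is the predecessor of x
        edge_labels[(z, x)] |= {inv(r) for r in edge_labels[(x, y)]}
    else:                                        #         z is another successor of x
        edge_labels[(x, z)] |= edge_labels[(x, y)]
    edge_labels[(x, y)] = set()                  # step 3: y becomes indirectly blocked
    for pair in list(neq):                       # step 4: z inherits y's inequalities
        if y in pair:
            (u,) = pair - {y}
            neq.add(frozenset({u, z}))


labels = {"x": set(), "y": {"D", "B"}, "z": {"D"}, "w": {"D"}}
edge_labels = {("x", "y"): {"S"}, ("x", "z"): {"R"}, ("x", "w"): {"S"}}
neq = {frozenset({"y", "w"})}
parent = {"y": "x", "z": "x", "w": "x"}
merge("x", "y", "z", labels, edge_labels, neq, parent)
print(labels["z"], edge_labels[("x", "z")], edge_labels[("x", "y")])
```

Node `y` is not removed from the tree; its edge label is emptied, which matches the convention that such a node counts as indirectly blocked.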

The definition of a clash takes care of the situation where the $\neq$ relation makes it
impossible to merge any two $S$-neighbours of $x$, while the $\to_{\mathit{choose}}$-rule ensures that all
$S$-neighbours of $x$ are labelled with either $D$ or $\mathord{\sim}D$. The relation $\neq$ is used to prevent
infinite sequences of rule applications for contradicting number restrictions of the form
$(\geq n\, S\, D)$ and $(\leq m\, S\, D)$, with $n > m$.

Labelling edges with sets of roles allows a single node to be both an $S$- and $R$-successor
of $x$ even if $S$ and $R$ are not comparable with respect to $\sqsubseteq^*$. An example of a concept
that enforces such a situation is $(\geq 2\, S_1\, A) \sqcap (\geq 2\, S_2\, A) \sqcap (\leq 3\, R\, A)$ with $S_i \sqsubseteq R$ for $i \in \{1, 2\}$, which
enforces a successor reachable both via $S_1$ and $S_2$.

We will now prove correctness of the tableau algorithm in a manner similar to the one
for the $\mathcal{SI}$-algorithm.

Termination 

As for $\mathcal{SI}$, termination of the algorithm is ensured by blocking, which prevents the creation
of unbounded paths in the completion tree.

Lemma 6.35 (Termination)

For each $\mathcal{SHIQ}$-concept $C$ and role hierarchy $\mathcal{R}$, the tableau algorithm terminates.

Proof. Let $m = \sharp\mathit{clos}(C,\mathcal{R})$, $k = \sharp\mathsf{NR}_{C,\mathcal{R}}$, and $n_{\max}$ the maximum $n$ that occurs in
a concept of the form $(\bowtie n\, S\, D) \in \mathit{clos}(C,\mathcal{R})$. Termination is a consequence of the
following properties of the expansion rules:

• The expansion rules never remove nodes from the tree or concepts from node labels.
Edge labels can only be changed by the $\to_{\leq}$-rule, which either expands them or sets
them to $\emptyset$; in the latter case, the node below the $\emptyset$-labelled edge is blocked and this
block is never broken.

• Each successor of a node $x$ is the result of the application of the $\to_{\exists}$-rule or the
$\to_{\geq}$-rule to $x$. (Note that the $\to_{\leq}$-rule does not move nodes in the tree.) For a node
$x$, each concept in $\mathbf{L}(x)$ can trigger the generation of successors at most once.






  For the $\to_{\exists}$-rule, if a successor $y$ of $x$ was generated for a concept $\exists R.D \in \mathbf{L}(x)$ and
  later $\mathbf{L}(x, y)$ is set to $\emptyset$ by the $\to_{\leq}$-rule, then there is some $R$-neighbour $z$ of $x$ with
  $D \in \mathbf{L}(z)$.

  For the $\to_{\geq}$-rule, if $y_1, \dots, y_n$ were generated by the $\to_{\geq}$-rule for $(\geq n\, S\, D) \in \mathbf{L}(x)$,
  then $y_i \neq y_j$ holds for all $1 \leq i < j \leq n$. This implies that there are always $n$
  $S$-neighbours $y'_1, \dots, y'_n$ of $x$ with $D \in \mathbf{L}(y'_i)$ and $y'_i \neq y'_j$ for all $1 \leq i < j \leq n$, since
  the $\to_{\leq}$-rule never merges two nodes $y'_i, y'_j$ with $y'_i \neq y'_j$ and, whenever an application
  of the $\to_{\leq}$-rule sets $\mathbf{L}(x, y'_i)$ to $\emptyset$, there is some $S$-neighbour $z$ of $x$ which "inherits"
  both $D$ and all inequalities from $y'_i$.

  Since $\mathit{clos}(C,\mathcal{R})$ contains a total of at most $m$ concepts of the form $\exists R.D$ and
  $(\geq n\, S\, D)$, the out-degree of the tree is bounded by $m \cdot n_{\max}$.

• Nodes are labelled with non-empty subsets of $\mathit{clos}(C,\mathcal{R})$ and edges with subsets of
$\mathsf{NR}_{C,\mathcal{R}}$, so there are at most $2^{2mk}$ different possible labellings for a pair of nodes and
an edge. Therefore, if a path is of length $> 2^{2mk}$, then, from the pair-wise blocking
condition, there must be two nodes $x, y$ on this path such that $x$ is directly blocked
by $y$.

Since a path on which nodes are blocked cannot become longer, paths are of length
at most $2^{2mk}$. ■



Completeness 

To prove completeness of the $\mathcal{SHIQ}$-algorithm, we proceed as for the $\mathcal{SI}$-algorithm and
guide the application of the non-deterministic $\to_{\mathit{choose}}$- and $\to_{\leq}$-rule using a function
that maps nodes of the completion tree to elements of a tableau.

Lemma 6.36

Let $C$ be a $\mathcal{SHIQ}$-concept in NNF. If $C$ has a tableau w.r.t. $\mathcal{R}$, then the expansion rules
can be applied in such a way that the tableau algorithm yields a complete and clash-free
completion tree for $C$ w.r.t. $\mathcal{R}$.

Proof. Let $T = (\mathbf{S}, \mathcal{L}, \mathcal{E})$ be a tableau for $C$ w.r.t. $\mathcal{R}$. We use this tableau to guide the
application of the non-deterministic rules. To do this, we will inductively define a function
$\pi$, mapping the nodes $\mathbf{V}$ of the tree $\mathbf{T}$ to $\mathbf{S}$ such that, for each $x, y \in \mathbf{V}$:

$(*)\quad \begin{cases} \mathbf{L}(x) \subseteq \mathcal{L}(\pi(x)) \\ \text{if } y \text{ is an } R\text{-neighbour of } x, \text{ then } (\pi(x), \pi(y)) \in \mathcal{E}(R) \\ x \neq y \text{ implies } \pi(x) \neq \pi(y) \end{cases}$

Claim 6.37 Let $\mathbf{T}$ be a completion tree and $\pi$ a function that satisfies $(*)$. If a rule is
applicable to $\mathbf{T}$, then the rule is applicable to $\mathbf{T}$ in a way that yields a completion tree $\mathbf{T}'$
and an extension of $\pi$ that satisfies $(*)$.

Let $\mathbf{T}$ be a completion tree and $\pi$ a function that satisfies $(*)$. We have to consider the
various rules.

• For the $\to_{\sqcap}$-, $\to_{\sqcup}$-, and $\to_{\exists}$-rule, this is analogous to the proof of Lemma 6.8 for
$\mathcal{SI}$.

• The $\to_{\forall}$-rule: If $\forall R.D \in \mathbf{L}(x)$, then $\forall R.D \in \mathcal{L}(\pi(x))$, and if $y$ is an $R$-neighbour
of $x$, then also $(\pi(x), \pi(y)) \in \mathcal{E}(R)$ due to $(*)$. (T6) implies $D \in \mathcal{L}(\pi(y))$ and hence
the $\to_{\forall}$-rule can be applied without violating $(*)$.

• The $\to_{\forall_+}$-rule: If $\forall R.D \in \mathbf{L}(x)$, then $\forall R.D \in \mathcal{L}(\pi(x))$, and if there is some $T \sqsubseteq^* R$
with $\mathsf{Trans}(T)$ and $y$ is a $T$-neighbour of $x$, then also $(\pi(x), \pi(y)) \in \mathcal{E}(T)$ due to
$(*)$. (T7) implies $\forall T.D \in \mathcal{L}(\pi(y))$ and hence the $\to_{\forall_+}$-rule can be applied without
violating $(*)$.

• The $\to_{\mathit{choose}}$-rule: If $(\bowtie n\, S\, D) \in \mathbf{L}(x)$, then $(\bowtie n\, S\, D) \in \mathcal{L}(\pi(x))$, and, if there is
an $S$-neighbour $y$ of $x$, then $(\pi(x), \pi(y)) \in \mathcal{E}(S)$ due to $(*)$. (T12) implies
$\{D, \mathord{\sim}D\} \cap \mathcal{L}(\pi(y)) \neq \emptyset$. Hence the $\to_{\mathit{choose}}$-rule can add an appropriate concept $E \in \{D, \mathord{\sim}D\}$
to $\mathbf{L}(y)$ such that $\mathbf{L}(y) \subseteq \mathcal{L}(\pi(y))$ holds.

• The $\to_{\geq}$-rule: If $(\geq n\, S\, D) \in \mathbf{L}(x)$, then $(\geq n\, S\, D) \in \mathcal{L}(\pi(x))$ and (T10) implies
$\sharp S^T(\pi(x), D) \geq n$. Hence there are elements $t_1, \dots, t_n \in \mathbf{S}$ such that $(\pi(x), t_i) \in \mathcal{E}(S)$,
$D \in \mathcal{L}(t_i)$, and $t_i \neq t_j$ for $1 \leq i < j \leq n$. The $\to_{\geq}$-rule generates $n$ new nodes
$y_1, \dots, y_n$. By setting $\pi' := \pi[y_1 \mapsto t_1, \dots, y_n \mapsto t_n]$, one obtains a function $\pi'$ that
satisfies $(*)$ for the extended tree.

• The $\to_{\leq}$-rule: If $(\leq n\, S\, D) \in \mathbf{L}(x)$, then $(\leq n\, S\, D) \in \mathcal{L}(\pi(x))$ and (T11) implies
$\sharp S^T(\pi(x), D) \leq n$. If the $\to_{\leq}$-rule is applicable, we have $\sharp S^{\mathbf{T}}(x, D) > n$, which
implies that there are at least $n + 1$ $S$-neighbours $y_0, \dots, y_n$ of $x$ such that $D \in \mathbf{L}(y_i)$.
Thus, there must be two nodes $y, z \in \{y_0, \dots, y_n\}$ such that $\pi(y) = \pi(z)$ (because
otherwise $\sharp S^T(\pi(x), D) > n$ would hold). Since $\pi(y) = \pi(z)$, we have that $y \neq z$
cannot hold because of $(*)$, and $y, z$ can be chosen such that $y$ is a successor of $x$
because $x$ has at most one predecessor. Hence the $\to_{\leq}$-rule can be applied without
violating $(*)$.

Why does this claim yield the completeness of the tableau algorithm? For the initial
completion tree consisting of a single node $x_0$ with $\mathbf{L}(x_0) = \{C\}$ and $\neq\ = \emptyset$, the function
$\pi = [x_0 \mapsto s_0]$ for some $s_0 \in \mathbf{S}$ with $C \in \mathcal{L}(s_0)$ satisfies $(*)$. Such an $s_0$ exists due to
(T1). Whenever a rule is applicable to $\mathbf{T}$, it can be applied in a way that maintains $(*)$,
and, since the algorithm terminates, we have that any sequence of rule applications must
terminate. Property $(*)$ implies that any tree $\mathbf{T}$ generated by these rule applications must
be clash-free as there are only two possibilities for a clash, and it is easy to see that neither
of these can hold in $\mathbf{T}$:

• $\mathbf{T}$ cannot contain a node $x$ such that $\{A, \neg A\} \subseteq \mathbf{L}(x)$ because $\mathbf{L}(x) \subseteq \mathcal{L}(\pi(x))$ and
hence (T2) would be violated for $\pi(x)$.

• $\mathbf{T}$ cannot contain a node $x$ with $(\leq n\, S\, D) \in \mathbf{L}(x)$ and $n + 1$ $S$-neighbours $y_0, \dots, y_n$
of $x$ with $D \in \mathbf{L}(y_i)$ and $y_i \neq y_j$ for $0 \leq i < j \leq n$ because $(\leq n\, S\, D) \in \mathcal{L}(\pi(x))$, and,
since $y_i \neq y_j$ implies $\pi(y_i) \neq \pi(y_j)$, $\sharp S^T(\pi(x), D) > n$, in contradiction to (T11). ■

Soundness 

Due to the presence of qualifying number restrictions and the lack of the finite model
property, the construction of a tableau from a complete and clash-free completion tree is
much more involved than it was for $\mathcal{SI}$ where, in case of a block, it was
possible to generate a cyclic model. Here, the completion tree is unraveled into an infinite
tree by successively substituting a blocked node by the subtree rooted at the blocking node.
The presence of qualifying number restrictions makes it necessary, in case of a blocking
situation, to record the pair of blocking and blocked node (see the case for (T10) in the
proof below).

Lemma 6.38 (Soundness)

If the $\mathcal{SHIQ}$-algorithm generates a complete and clash-free completion tree for a concept
$C$ and a role hierarchy $\mathcal{R}$, then $C$ has a tableau w.r.t. $\mathcal{R}$.

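Proof. Let $\mathbf{T} = (\mathbf{V}, \mathbf{E}, \mathbf{L})$ be a complete and clash-free completion tree. A path is a
sequence of pairs of nodes of $\mathbf{T}$ of the form $p = \left[\frac{x_0}{x'_0}, \dots, \frac{x_n}{x'_n}\right]$. We define auxiliary functions
Tail, Tail' by setting, for such a path $p$, $\mathsf{Tail}(p) = x_n$ and $\mathsf{Tail}'(p) = x'_n$. With $\left[p \,\middle|\, \frac{x_{n+1}}{x'_{n+1}}\right]$ we
denote the path $\left[\frac{x_0}{x'_0}, \dots, \frac{x_n}{x'_n}, \frac{x_{n+1}}{x'_{n+1}}\right]$. The set $\mathsf{Paths}(\mathbf{T})$ is defined inductively as follows:

• For the root node $x_0$ of $\mathbf{T}$, $\left[\frac{x_0}{x_0}\right] \in \mathsf{Paths}(\mathbf{T})$, and

• For a path $p \in \mathsf{Paths}(\mathbf{T})$ and a node $z$ in $\mathbf{T}$:

  - if $z$ is a successor of $\mathsf{Tail}(p)$ and $z$ is not blocked, then $\left[p \,\middle|\, \frac{z}{z}\right] \in \mathsf{Paths}(\mathbf{T})$, or

  - if, for some node $y$ in $\mathbf{T}$, $y$ is a successor of $\mathsf{Tail}(p)$ and $z$ blocks $y$, then $\left[p \,\middle|\, \frac{z}{y}\right] \in \mathsf{Paths}(\mathbf{T})$.

Please note that, due to the construction of Paths, for $p \in \mathsf{Paths}(\mathbf{T})$ with $p = \left[p' \,\middle|\, \frac{x}{x'}\right]$,
we have that $x$ is not blocked, $x'$ is blocked iff $x \neq x'$, and $x'$ is never indirectly blocked:
it is either directly blocked or unblocked. Furthermore, the blocking condition implies
$\mathbf{L}(x) = \mathbf{L}(x')$.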
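Since the set of paths is infinite as soon as a node is blocked, it can only be enumerated up to a bound. The sketch below lists all paths of at most `max_len` pairs. It rests on stated assumptions: `children` lists only unblocked or directly blocked successors, `blocked_by` maps each directly blocked node to its blocking node, and a path is a tuple of `(x, x')` pairs; the function name `paths` is hypothetical.

```python
from typing import Dict, List, Tuple

Path = Tuple[Tuple[str, str], ...]


def paths(root: str, children: Dict[str, List[str]],
          blocked_by: Dict[str, str], max_len: int) -> List[Path]:
    """Enumerate the unraveled paths with at most max_len pairs, shortest first."""
    result: List[Path] = [((root, root),)]
    frontier = list(result)
    for _ in range(max_len - 1):
        extended: List[Path] = []
        for p in frontier:
            tail = p[-1][0]  # Tail(p): the first component of the last pair
            for child in children.get(tail, []):
                if child in blocked_by:
                    # Continue at the blocking node but remember the blocked one.
                    extended.append(p + ((blocked_by[child], child),))
                else:
                    extended.append(p + ((child, child),))
        result += extended
        frontier = extended
    return result


# Hypothetical tree a -> b -> c -> d in which d is blocked by b.
children = {"a": ["b"], "b": ["c"], "c": ["d"]}
blocked_by = {"d": "b"}
for p in paths("a", children, blocked_by, 5):
    print(p)
```

The fourth path ends in the pair `("b", "d")`, and the fifth continues below `b` again, which is how the finite tree is unfolded into an infinite structure.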

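Now we can define a tableau $T = (\mathbf{S}, \mathcal{L}, \mathcal{E})$ with:

$\mathbf{S} = \mathsf{Paths}(\mathbf{T})$,
$\mathcal{L}(p) = \mathbf{L}(\mathsf{Tail}(p))$,
$\mathcal{E}(R) = \{(p, q) \in \mathbf{S} \times \mathbf{S} \mid$ either $q = \left[p \,\middle|\, \frac{x}{x'}\right]$ and $x'$ is an $R$-successor of $\mathsf{Tail}(p)$,
          or $p = \left[q \,\middle|\, \frac{x}{x'}\right]$ and $x'$ is an $\mathsf{Inv}(R)$-successor of $\mathsf{Tail}(q)\}$.

Claim 6.39 $T$ is a tableau for $C$ w.r.t. $\mathcal{R}$.

We show that $T$ satisfies all the properties from Definition 6.32.

• $C \in \mathcal{L}\left(\left[\frac{x_0}{x_0}\right]\right)$ because $C \in \mathbf{L}(x_0)$, hence (T1) holds.

• (T2) holds because $\mathbf{T}$ is clash-free.

• (T3), (T4) hold because $\mathbf{T}$ is complete.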

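• (T5): assume $\exists R.D \in \mathcal{L}(p)$ and let $x = \mathsf{Tail}(p)$. In $\mathbf{T}$, there is an $R$-neighbour $y$ of
$x$ with $D \in \mathbf{L}(y)$ because $x$ is not blocked and the $\to_{\exists}$-rule is not applicable. There
are two possibilities:

  - $y$ is a successor of $x$ in $\mathbf{T}$. If $y$ is not blocked, then $q := \left[p \,\middle|\, \frac{y}{y}\right] \in \mathbf{S}$ and $(p, q) \in \mathcal{E}(R)$
    as well as $D \in \mathcal{L}(q)$. If $y$ is blocked by some node $z$ in $\mathbf{T}$, then $q := \left[p \,\middle|\, \frac{z}{y}\right] \in \mathbf{S}$,
    $(p, q) \in \mathcal{E}(R)$ and, since $\mathcal{L}(q) = \mathbf{L}(z) = \mathbf{L}(y)$, $D \in \mathcal{L}(q)$.

  - $y$ is a predecessor of $x$. Again, there are two possibilities:

    * $p$ is of the form $p = \left[q \,\middle|\, \frac{x}{x'}\right]$ with $\mathsf{Tail}(q) = y$.

    * $p$ is of the form $p = \left[q \,\middle|\, \frac{x}{x'}\right]$ with $\mathsf{Tail}(q) = u \neq y$. Since $x$ only has one
      predecessor in $\mathbf{T}$, the node $u$ cannot be the predecessor of $x$. Then it
      must be the predecessor of $x'$ in $\mathbf{T}$, $x' \neq x$, and $x$ blocks $x'$, all due to
      the construction of $\mathsf{Paths}(\mathbf{T})$. Together with the definition of the blocking
      condition, this implies $\mathbf{L}(u, x') = \mathbf{L}(y, x)$ as well as $\mathbf{L}(u) = \mathbf{L}(y)$, due to the
      pair-wise blocking condition.

    In both cases, $(p, q) \in \mathcal{E}(R)$ and $D \in \mathcal{L}(q)$.

• (T6): assume $\forall R.D \in \mathcal{L}(p)$ and $(p, q) \in \mathcal{E}(R)$. If $q = \left[p \,\middle|\, \frac{x}{x'}\right]$, then $x'$ is an $R$-successor
of $\mathsf{Tail}(p)$, and thus $D \in \mathbf{L}(x')$ because the $\to_{\forall}$-rule is not applicable. Since
$\mathcal{L}(q) = \mathbf{L}(x) = \mathbf{L}(x')$, we have $D \in \mathcal{L}(q)$. If $p = \left[q \,\middle|\, \frac{x}{x'}\right]$, then $x'$ is an $\mathsf{Inv}(R)$-successor
of $\mathsf{Tail}(q)$, $\mathsf{Tail}(q)$ is an $R$-neighbour of $x'$, and thus $D \in \mathcal{L}(q) = \mathbf{L}(\mathsf{Tail}(q))$ because $x'$ is
not indirectly blocked and the $\to_{\forall}$-rule is not applicable.

• (T7): assume $\forall R.D \in \mathcal{L}(p)$ and $(p, q) \in \mathcal{E}(T)$ for some $T \sqsubseteq^* R$ with $\mathsf{Trans}(T)$. If
$q = \left[p \,\middle|\, \frac{x}{x'}\right]$, then $x'$ is a $T$-successor of $\mathsf{Tail}(p)$ and thus $\forall T.D \in \mathbf{L}(x')$ because otherwise
the $\to_{\forall_+}$-rule would be applicable. From $\mathcal{L}(q) = \mathbf{L}(x) = \mathbf{L}(x')$, it follows that
$\forall T.D \in \mathcal{L}(q)$. If $p = \left[q \,\middle|\, \frac{x}{x'}\right]$, then $x'$ is an $\mathsf{Inv}(T)$-successor of $\mathsf{Tail}(q)$, and hence
$\mathsf{Tail}(q)$ is a $T$-neighbour of $x'$. Because $x'$ is not indirectly blocked, this implies
$\forall T.D \in \mathcal{L}(q) = \mathbf{L}(\mathsf{Tail}(q))$.

• (T12): assume $(\bowtie n\, S\, D) \in \mathcal{L}(p)$ and $(p, q) \in \mathcal{E}(S)$. If $q = \left[p \,\middle|\, \frac{x}{x'}\right]$, then $x'$ is
an $S$-successor of $\mathsf{Tail}(p)$ and thus $\{D, \mathord{\sim}D\} \cap \mathbf{L}(x') \neq \emptyset$ because the $\to_{\mathit{choose}}$-rule
is not applicable. Since $\mathcal{L}(q) = \mathbf{L}(x) = \mathbf{L}(x')$, we have $\{D, \mathord{\sim}D\} \cap \mathcal{L}(q) \neq \emptyset$. If
$p = \left[q \,\middle|\, \frac{x}{x'}\right]$, then $x'$ is an $\mathsf{Inv}(S)$-successor of $\mathsf{Tail}(q)$, $\mathsf{Tail}(q)$ is an $S$-neighbour of $x'$,
and thus $\{D, \mathord{\sim}D\} \cap \mathcal{L}(q) \neq \emptyset$ with $\mathcal{L}(q) = \mathbf{L}(\mathsf{Tail}(q))$, because $x'$ is not indirectly blocked and
the $\to_{\mathit{choose}}$-rule is not applicable.

• (T8) is satisfied due to the symmetric definition of $\mathcal{E}$.

• (T9) is satisfied due to the definition of $R$-successors that takes into account the role
hierarchy.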

• (T10): assume $(\geq n\, S\, D) \in \mathcal{L}(p)$ and let $x = \mathsf{Tail}(p)$. Completeness of $\mathbf{T}$ implies that there exist $n$
distinct individuals $y_1, \dots, y_n$ in $\mathbf{T}$ such that each $y_i$ is an $S$-neighbour of $\mathsf{Tail}(p)$ and
$D \in \mathbf{L}(y_i)$. We claim that, for each of these individuals, there is a path $q_i$ such that
$(p, q_i) \in \mathcal{E}(S)$, $D \in \mathcal{L}(q_i)$, and $q_i \neq q_j$ for all $1 \leq i < j \leq n$. Obviously, this implies
$\sharp S^T(p, D) \geq n$. For each $y_i$, there are three possibilities:

  - $y_i$ is an $S$-successor of $x$ and $y_i$ is not blocked in $\mathbf{T}$. Then $q_i = \left[p \,\middle|\, \frac{y_i}{y_i}\right]$ is a path
    with the desired properties.

  - $y_i$ is an $S$-successor of $x$ and $y_i$ is blocked in $\mathbf{T}$ by some node $z$. Then $q_i = \left[p \,\middle|\, \frac{z}{y_i}\right]$
    is the path with the desired properties. Since the same $z$ may block several of
    the $y_j$s, it is indeed necessary to include the blocked nodes explicitly in the
    path construction to make these paths distinguishable.

  - $x$ is an $\mathsf{Inv}(S)$-successor of $y_i$. Since $\mathbf{T}$ is a tree, there may be at most one such
    $y_i$. This implies that $p$ is of the form $p = \left[q \,\middle|\, \frac{x}{x'}\right]$ with $\mathsf{Tail}(q) = y_i$. The path $q$
    has the desired properties and, obviously, $q$ is distinct from all other paths $q_j$.

• Assume (T11) is violated. Hence there is some $p \in \mathbf{S}$ with $(\leq n\, S\, D) \in \mathcal{L}(p)$ and
$\sharp S^T(p, D) > n$. We show that this implies $\sharp S^{\mathbf{T}}(\mathsf{Tail}(p), D) > n$, in contradiction to
either clash-freeness or completeness of $\mathbf{T}$. Define $x = \mathsf{Tail}(p)$ and $P = S^T(p, D)$.
Due to the assumption, we have $\sharp P > n$. We distinguish two cases:

  - $P$ contains only paths of the form $q = \left[p \,\middle|\, \frac{y}{y'}\right]$. We claim that the function Tail'
    is injective on $P$. Assume that there are two paths $q_1, q_2 \in P$ with $q_1 \neq q_2$ and
    $\mathsf{Tail}'(q_1) = \mathsf{Tail}'(q_2) = y'$. Then $q_1$ is of the form $q_1 = \left[p \,\middle|\, \frac{y_1}{y'}\right]$ and $q_2$ is of the
    form $q_2 = \left[p \,\middle|\, \frac{y_2}{y'}\right]$ with $y_1 \neq y_2$. If $y'$ is not blocked in $\mathbf{T}$, then $y_1 = y' = y_2$,
    contradicting $q_1 \neq q_2$. If $y'$ is blocked in $\mathbf{T}$, then both $y_1$ and $y_2$ block $y'$, which
    implies $y_1 = y_2$, again a contradiction.

    Since Tail' is injective on $P$, it holds that $\sharp P = \sharp\mathsf{Tail}'(P)$. Also for each $y' \in
    \mathsf{Tail}'(P)$, $y'$ is an $S$-successor of $x$, and $D \in \mathbf{L}(y')$. This implies $\sharp S^{\mathbf{T}}(x, D) > n$.

  - $P$ contains a path $q$ where $p$ is of the form $p = \left[q \,\middle|\, \frac{x}{x'}\right]$. Obviously, $P$ may only
    contain one such path. Let $z = \mathsf{Tail}(q)$. As in the previous case, Tail' is an injective function on
    the set $P' := P \setminus \{q\}$, each $y' \in \mathsf{Tail}'(P')$ is an $S$-successor of $x$ and $D \in \mathbf{L}(y')$ for
    each $y' \in \mathsf{Tail}'(P')$. To show that indeed $\sharp S^{\mathbf{T}}(x, D) > n$ holds, we have to prove
    the existence of a further $S$-neighbour $u$ of $x$ with $D \in \mathbf{L}(u)$ and $u \notin \mathsf{Tail}'(P')$.
    We distinguish two cases:

    * $x = x'$. Hence $x$ is not blocked. This implies that $x$ is an $\mathsf{Inv}(S)$-successor
      of $z$ in $\mathbf{T}$. Since $\mathsf{Tail}'(P')$ contains only successors of $x$, we have that $z \notin
      \mathsf{Tail}'(P')$ and, by construction, $z$ is an $S$-neighbour of $x$ with $D \in \mathbf{L}(z)$.

    * $x \neq x'$. This implies that $x'$ is blocked in $\mathbf{T}$ by $x$ and that $x'$ is an $\mathsf{Inv}(S)$-successor
      of $z$ in $\mathbf{T}$. The definition of pair-wise blocking implies that $x$ is
      an $\mathsf{Inv}(S)$-successor of some node $u$ in $\mathbf{T}$ with $\mathbf{L}(u) = \mathbf{L}(z)$. Again, since
      $\mathsf{Tail}'(P')$ contains only successors of $x$, we have that $u \notin \mathsf{Tail}'(P')$ and, by
      construction, $u$ is an $S$-neighbour of $x$ with $D \in \mathbf{L}(u)$. ■

By showing termination (Lemma 6.35), soundness (Lemma 6.36), and completeness 
(Lemma 6.38) of the SHZQ-algorithm, we have established its correctness: 

Theorem 6.40 

The SHLQ- algorithm is a non- deterministic decision procedure for satisfiability and sub- 
sumption of SHLQ-concepts w.r.t. a role hierarchy. 

Of course, due to Theorem 6.17, the tableau algorithm can also be used to decide
satisfiability and subsumption of $\mathcal{SHIQ}$-concepts w.r.t. a general TBox. To apply the
algorithm to $\mathcal{SHIQ}$-knowledge bases, one can either use a pre-completion approach similar
to the one used to prove Theorem 4.42 (probably with catastrophic effects on the runtime
of the algorithm), or one can integrate the ABox directly into the tableau algorithm.
Horrocks, Sattler, and Tobies (2000b) present an algorithm that follows the latter approach.

We have already mentioned that we do not expect to obtain a worst-case optimal
solution for $\mathcal{SHIQ}$-satisfiability from the tableau approach; such an algorithm has already
been given in order to prove Theorem 6.29. Instead, Algorithm 6.34 is intended as a
practical decision procedure that can be optimized so that it performs well for reasoning
tasks occurring in applications. Nevertheless, it is interesting to know how far our tableau
approach exceeds the worst-case complexity.

Lemma 6.41 

The $\mathcal{SHIQ}$-algorithm runs in 2-NExpTime.

Proof. Let $C$ be a $\mathcal{SHIQ}$-concept and $\mathcal{R}$ a role hierarchy. Let
$m = \sharp clos(C, \mathcal{R})$, $k = \sharp NR_{C,\mathcal{R}}$, and $n_{max}$ be the maximum $n$ that occurs in a
qualifying number restriction in $clos(C, \mathcal{R})$. If we set $n = |C| + |\mathcal{R}|$, then it holds that
$m = O(|C| \cdot |\mathcal{R}|) = O(n^2)$, $k = O(|C| + |\mathcal{R}|) = O(n)$, and $n_{max} = O(2^{|C|}) = O(2^n)$.
In the proof of Lemma 6.35, we have shown that paths in a completion tree for $C$ become no
longer than $2^{2mk}$ and that the out-degree of a completion tree is bounded by $m \cdot n_{max}$.
Hence, the $\mathcal{SHIQ}$-algorithm will construct a tree with no more than

$$(m \cdot n_{max})^{2^{2mk}} = O\big((n^2 \cdot 2^n)^{2^{O(n^3)}}\big) = O\big(2^{2n \cdot 2^{O(n^3)}}\big) = O\big(2^{2^{O(n^3)}}\big)$$

nodes. Each node of this tree is labelled with a subset of $clos(C, \mathcal{R})$ and each edge is
labelled with a subset of $NR_{C,\mathcal{R}}$. Since every application of a rule either adds a node to the
tree, a concept or role to one of the labels, or sets the label of an edge to $\emptyset$ (in which case
the corresponding successor is blocked forever), the $\mathcal{SHIQ}$-algorithm runs in 2-NExpTime.

This seems to be a discouraging result: the tableau algorithm runs in 2-NExpTime
while the worst-case complexity is only ExpTime. On the other hand, there exist DL systems
like FaCT (Horrocks, 1999), which is based on Algorithm 6.34, or RACE (Haarslev
& Möller, 1999), which is based on a similar algorithm (Haarslev & Möller, 2000a). These
systems show good performance in system comparisons (Massacci & Donini, 2000) and
are successfully utilized in a number of applications (e.g., Haarslev & Möller, 2000b;
Franconi & Ng, 2000). This can be explained by the fact that tableau algorithms seem to be
particularly amenable to optimizations (Baader et al., 1994; Horrocks, 1997; Horrocks &
Patel-Schneider, 1999; Horrocks & Tobies, 2000; Haarslev & Möller, 2000c). It is these
optimizations that cause the good behaviour of implementations based on tableau algorithms
like Algorithm 6.34.



Chapter 7 
Guarded Fragments 



The Guarded Fragment of first-order logic, introduced by Andréka, van Benthem, and
Németi (1998), is a successful attempt to transfer many good properties of modal, temporal,
and description logics to a large, naturally defined fragment of predicate logic. Among
these are decidability, the finite model property, invariance under an appropriate variant
of bisimulation, and other nice model theoretic properties (Andréka et al., 1998; Grädel,
1999b).

The Guarded Fragment (GF) is obtained from full first-order logic through relativization
of quantifiers by so-called guard formulas. Every appearance of a quantifier in GF
must be of the form

$\exists \mathbf{y}(\alpha(\mathbf{x}, \mathbf{y}) \wedge \phi(\mathbf{x}, \mathbf{y}))$ or $\forall \mathbf{y}(\alpha(\mathbf{x}, \mathbf{y}) \rightarrow \phi(\mathbf{x}, \mathbf{y}))$,

where $\alpha$ is a positive atomic formula, the guard, that contains all free variables of $\phi$. This
generalises quantification in description, modal, and temporal logics, where quantification
is restricted to those elements reachable via some accessibility relation. For example,
in DLs, quantification occurs in the form of existential and universal restrictions like
$\forall$has_child.Rich, which expresses that those individuals reachable via the role (guarded
by) has_child must be rich.

By allowing for more general formulas as guards while preserving the idea of quantification
only over elements that are close together in the model, one obtains generalisations
of GF which are still well-behaved in the above sense. Most importantly, one can obtain
the loosely guarded fragment (LGF) (van Benthem, 1997) and the clique guarded fragment
(CGF) (Grädel, 1999a), for which decidability, invariance under clique guarded bisimulation,
and some other properties have been shown in (Grädel, 1999a).

Guarded fragments have spawned considerable interest in the DL community, mainly
for two reasons. On the one hand, many DLs can be embedded into suitable guarded
fragments, which allows the transfer, e.g., of decidability results from guarded logics to
DLs. Gonçalves and Grädel (2000) prove decidability of the guarded fragment $\mu\mathcal{ACGFI}$,
which, among other DLs, allows a simple embedding of $\mathcal{ALCQI}$ and $\mathcal{ALCQI}b$, proving the
decidability of these logics. On the other hand, guarded fragments generalise DLs and
add expressive power that is not present in classical DLs, but interesting for knowledge
representation. For example, Lutz, Sattler, and Tobies (1999) present a restriction of GF
that strictly contains $\mathcal{ALCI}$ and allows for n-ary relations instead of the binary roles of
most DLs.

GF, LGF, and CGF are decidable and known to be 2-ExpTime-complete, which is
shown by Grädel (1999a, 1999b) using game and automata-based approaches. For these
guarded fragments, the automata approach has the same problems as it has for modal and
description logics: it is unclear how to turn the automata decision procedures into efficient
implementations, and a naive implementation has exponential complexity in every case, which
makes it unusable for applications. So, while these approaches yield (worst-case) optimal
complexity results for many logics, they appear to be unsuitable as a starting point for
an efficient implementation. As we have seen, many decidability results for modal or
description logics are based on tableau algorithms and some of the fastest implementations
of modal satisfiability procedures are based on tableau calculi (Horrocks, 2000;
Patel-Schneider, 2000). In contrast to automata-based algorithms, their average-case
behaviour in practice is so good that finding really hard problems to test these
implementations has become a problem in itself (Horrocks et al., 2000). In this chapter, we
generalise the principles of the tableau algorithms encountered in this thesis to develop a
tableau algorithm for CGF.

Recall the conjecture by Vardi that the tree model property is the main reason for the
decidability of many modal style logics (Vardi, 1996). As pointed out in (Grädel, 1999b),
the generalised tree model property explains the similarly robust decidability of guarded
logics, and can be seen as a strong indication that guarded logics are a generalisation of
modal logics that retain the essence of modal logics. This becomes even more evident
when regarding the respective fixed-point extensions (Grädel & Walukiewicz, 1999) and
is the foundation of general decidability results for guarded logics via reduction to the
modal $\mu$-calculus and the monadic theory of countable trees ($S\omega S$) (Grädel, 2001). The
generalised tree model property of CGF is also essential for our tableau algorithm. Indeed,
as a corollary of the constructions used to show the soundness of our algorithm, we obtain
an alternative proof for the fact that CGF has the generalised tree model property.

7.1 Syntax and Semantics 

For the definitions of GF and LGF we refer the reader to (Grädel, 1999a). The clique
guarded fragment CGF of first-order logic can be obtained in two equivalent ways, by
either semantically or syntactically restricting the range of the first-order quantifiers. In
the following we will use bold letters to refer to tuples of elements of the universe
($\mathbf{a}, \mathbf{b}, \ldots$) resp. tuples of variables ($\mathbf{x}, \mathbf{y}, \ldots$).

Definition 7.1 (Semantic CGF) 

Let $\tau$ be a relational vocabulary. For a $\tau$-structure $\mathfrak{A}$ with universe $A$, the Gaifman graph
of $\mathfrak{A}$ is defined as the undirected graph $G(\mathfrak{A}) = (A, E^{\mathfrak{A}})$ with

$E^{\mathfrak{A}} = \{(a, a') : a \neq a'$, there exists $R \in \tau$ and
$\mathbf{a} \in R^{\mathfrak{A}}$ which contains both $a$ and $a'\}$.

Under clique guarded semantics we understand the modification of standard first-order
semantics, where, instead of ranging over all elements of the universe, a quantifier is
restricted to elements that form a clique in the Gaifman graph, including the binding for
the free variables of the matrix formula. More precisely, let $\mathfrak{A}$ be a $\tau$-structure and $\rho$ an
environment mapping variables to elements of $A$. We define the model relation inductively
over the structure of formulas as the usual FO semantics with the exception

$\mathfrak{A}, \rho \models \forall y.\phi(\mathbf{x}, y)$ iff, for all $a \in A$ such that
$\rho(\mathbf{x}) \cup \{a\}$ forms a clique in $G(\mathfrak{A})$,
it is the case that $\mathfrak{A}, \rho[y \mapsto a] \models \phi$,

and a similar definition for the existential case. With CGF we denote first-order logic
restricted to clique guarded semantics. ◇
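
The Gaifman graph and the clique condition used by the clique guarded semantics can be illustrated with a small sketch. It assumes, purely for illustration, that a finite structure is given as a dictionary from relation names to sets of tuples; the names `gaifman_edges` and `forms_clique` are hypothetical and not part of the thesis.

```python
from itertools import combinations


def gaifman_edges(relations):
    """Return the edges of the Gaifman graph as a set of two-element frozensets.

    `relations` maps a relation name to a set of tuples over the universe.
    Two distinct elements are adjacent iff they occur together in some tuple.
    """
    edges = set()
    for tuples in relations.values():
        for tup in tuples:
            for a, b in combinations(set(tup), 2):
                edges.add(frozenset((a, b)))
    return edges


def forms_clique(elements, edges):
    """Check that every two distinct elements are adjacent in the Gaifman graph."""
    return all(
        frozenset((a, b)) in edges for a, b in combinations(set(elements), 2)
    )


# Illustrative structure: R = {(1, 2, 3)}, S = {(3, 4)}.
relations = {"R": {(1, 2, 3)}, "S": {(3, 4)}}
edges = gaifman_edges(relations)
print(forms_clique({1, 2, 3}, edges))  # all three occur together in one R-tuple
print(forms_clique({2, 3, 4}, edges))  # 2 and 4 never occur in a common tuple
```

Under clique guarded semantics, a quantifier evaluated at the binding {2, 3} would therefore range over the elements 1, 2 and 3, but not over 4.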

Definition 7.2 (Syntactic CGF) 

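Let $\tau$ be a relational vocabulary. A formula $\alpha$ is a clique-formula for a set
$\mathbf{x} \subseteq \mathrm{free}(\alpha)$ if $\alpha$ is a (possibly empty if $\mathbf{x}$ contains only one variable)
conjunction of atoms (excluding equality statements) such that each two distinct elements
from $\mathbf{x}$ coexist in at least one atom, each atom contains at least an element from $\mathbf{x}$,
and each element from $\mathrm{free}(\alpha) \setminus \mathbf{x}$ occurs exactly once in $\alpha$. In the following, we
will identify a clique-formula $\alpha$ with the set of its conjuncts.

The syntactic CGF is inductively defined as follows.

1. Every relational atomic formula $R x_{i_1} \ldots x_{i_m}$ or $x_i = x_j$ belongs to CGF.

2. CGF is closed under Boolean operations.

3. If $\mathbf{x}, \mathbf{y}, \mathbf{z}$ are tuples of variables, $\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z})$ is a clique-formula for
$\mathbf{x} \cup \mathbf{y}$ and $\phi(\mathbf{x}, \mathbf{y})$ is a formula in CGF such that $\mathrm{free}(\phi) \subseteq \mathbf{x} \cup \mathbf{y}$,

then $\exists \mathbf{y}\mathbf{z}.(\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z}) \wedge \phi(\mathbf{x}, \mathbf{y}))$
and $\forall \mathbf{y}\mathbf{z}.(\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z}) \rightarrow \phi(\mathbf{x}, \mathbf{y}))$

belong to CGF.

We will use $(\exists \mathbf{y}\mathbf{z}.\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z}))\phi(\mathbf{x}, \mathbf{y})$ and
$(\forall \mathbf{y}\mathbf{z}.\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z}))\phi(\mathbf{x}, \mathbf{y})$ as alternative notations
for $\exists \mathbf{y}\mathbf{z}.(\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z}) \wedge \phi(\mathbf{x}, \mathbf{y}))$ and
$\forall \mathbf{y}\mathbf{z}.(\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z}) \rightarrow \phi(\mathbf{x}, \mathbf{y}))$, respectively. A formula of
the form $\forall \mathbf{y}\mathbf{z}.(\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z}) \rightarrow \phi(\mathbf{x}, \mathbf{y}))$ is called universally quantified. ◇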
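
The three conditions on a clique-formula can be checked mechanically. The following is a minimal sketch under assumed conventions: a conjunction is a list of atoms, each atom is a pair of a relation name and a tuple of variable names, and the function name `is_clique_formula` is hypothetical. The empty conjunction is accepted only for a single variable, as in the definition.

```python
from collections import Counter
from itertools import combinations


def is_clique_formula(atoms, xs):
    """Check whether the conjunction `atoms` is a clique-formula for the set `xs`.

    `atoms` is a list of (relation_name, argument_tuple) pairs without equalities.
    """
    xs = set(xs)
    if not atoms:
        # The empty conjunction is only allowed if xs contains a single variable.
        return len(xs) == 1
    occurrences = Counter(v for _, args in atoms for v in args)
    if not xs <= set(occurrences):
        return False
    # Each two distinct elements of xs coexist in at least one atom.
    pairs_ok = all(
        any(a in args and b in args for _, args in atoms)
        for a, b in combinations(xs, 2)
    )
    # Each atom contains at least one element of xs.
    atoms_ok = all(xs & set(args) for _, args in atoms)
    # Each remaining free variable occurs exactly once in the conjunction.
    aux_ok = all(occurrences[v] == 1 for v in set(occurrences) - xs)
    return pairs_ok and atoms_ok and aux_ok


guard = [("R", ("x", "y", "z1")), ("S", ("x", "w", "z2")), ("T", ("y", "w"))]
print(is_clique_formula(guard, {"x", "y", "w"}))      # every pair is covered
print(is_clique_formula(guard[:2], {"x", "y", "w"}))  # y and w share no atom
```

In the first call, z1 and z2 play the role of the auxiliary variables in $\mathbf{z}$: they help to form the clique but each occurs only once.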

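The following Lemma can be shown by elementary formula manipulations that exploit
that every $z \in \mathbf{z}$ occurs exactly once in $\alpha$.

Lemma 7.3

Let $\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z})$ be a clique-formula for $\mathbf{x}, \mathbf{y}$. Then

$$\forall \mathbf{y}\mathbf{z}.(\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z}) \rightarrow \phi(\mathbf{x}, \mathbf{y})) \equiv \forall \mathbf{y}.(\exists \mathbf{z}.\alpha(\mathbf{x}, \mathbf{y}, \mathbf{z}) \rightarrow \phi(\mathbf{x}, \mathbf{y})).$$

The use of the name CGF for both the semantic and the syntactic clique guarded
fragment is justified by the following Lemma.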

Lemma 7.4 

Over any finite relational vocabulary the syntactic and semantic versions of the CGF are
equally expressive.

Proof sketch: By some elementary equivalence transformations, every syntactically clique
guarded formula can be brought into a form where switching from standard semantics to
clique guarded semantics does not change its meaning. Conversely, for any finite signature
there is a finite disjunction $clique(\mathbf{x}, \mathbf{y}, \mathbf{z})$ of clique-formulas for $\mathbf{x}, \mathbf{y}$ such that
$\mathbf{a}, \mathbf{b}$ form a clique in $G(\mathfrak{A})$ iff $\mathfrak{A} \models \exists \mathbf{z}.clique(\mathbf{a}, \mathbf{b}, \mathbf{z})$. By guarding every
quantifier with such a formula and applying some elementary formula transformations and
Lemma 7.3, we get, for every FO formula $\psi$, a syntactically clique guarded formula that is
equivalent to $\psi$ under clique guarded semantics. If we fix a finite relational vocabulary,
this transformation is polynomial in the number of variables of the formula, or, more
precisely, the maximal number of free variables of all sub-formulas. ■

In the following we will only consider the syntactic variant of the clique guarded frag- 
ment. 

Definition 7.5 (NNF, Closure, Width) 

In the following, all formulas are assumed to be in negation normal form (NNF), where
negation occurs only in front of atomic formulas. Every formula in CGF can be transformed
into NNF in linear time by pushing negation inwards using De Morgan's laws and the duality
of the quantifiers.

For a sentence $\psi \in$ CGF in NNF, let $clos(\psi)$ be the smallest set that contains $\psi$ and is
closed under sub-formulas. Let $C$ be a set of constants. With $clos(\psi, C)$ we denote the set

$$clos(\psi, C) = \{\phi(\mathbf{a}) : \mathbf{a} \subseteq C, \phi(\mathbf{x}) \in clos(\psi)\}.$$

The width of a formula $\psi \in$ CGF is defined by

$$\mathrm{width}(\psi) := \max\{|\mathrm{free}(\phi)| : \phi \in clos(\psi)\}.$$

◇
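
As an illustration of the width of a formula, the following sketch computes the free variables of every sub-formula and takes the maximum. The tuple encoding of formulas is an assumption made only for this example (guarded quantifiers are written as a tag, the bound variables, the guard, and the body), and the guard is treated as a sub-formula.

```python
def free(f):
    """Free variables of a formula in the illustrative tuple encoding."""
    tag = f[0]
    if tag == "atom":                  # ("atom", "R", ("x", "y"))
        return set(f[2])
    if tag == "eq":                    # ("eq", "x", "y")
        return {f[1], f[2]}
    if tag == "not":                   # ("not", g)
        return free(f[1])
    if tag in ("and", "or"):           # ("and", g, h)
        return free(f[1]) | free(f[2])
    if tag in ("exists", "forall"):    # ("exists", bound_vars, guard, body)
        return (free(f[2]) | free(f[3])) - set(f[1])
    raise ValueError(f"unknown formula tag: {tag}")


def subformulas(f):
    """Enumerate f and all of its sub-formulas (guards included)."""
    yield f
    tag = f[0]
    if tag == "not":
        yield from subformulas(f[1])
    elif tag in ("and", "or"):
        yield from subformulas(f[1])
        yield from subformulas(f[2])
    elif tag in ("exists", "forall"):
        yield from subformulas(f[2])
        yield from subformulas(f[3])


def width(f):
    """Maximal number of free variables over all sub-formulas of f."""
    return max(len(free(g)) for g in subformulas(f))


# (exists x y. R(x, y)) (forall z. S(y, z)) P(z)
psi = (
    "exists", ("x", "y"), ("atom", "R", ("x", "y")),
    ("forall", ("z",), ("atom", "S", ("y", "z")), ("atom", "P", ("z",))),
)
print(free(psi), width(psi))
```

The example sentence has no free variables, while its guards R(x, y) and S(y, z) each have two, which determines the width.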

7.2 Reasoning with Guarded Fragments 

Many of the approaches for decision procedures for description and modal logics described
in Section 3.1 have been successfully applied for CGF: Grädel (1999a) shows decidability of
(a fixed point extension of) CGF using translation to the monadic second-order theory of
countable trees $S\omega S$ (Rabin, 1969) and to the modal $\mu$-calculus with backwards modalities
(Vardi, 1998). Also in (Grädel, 1999a), it is shown that CGF is 2-ExpTime-complete and
ExpTime-complete for sentences of bounded width, where the upper bound is based on
a reduction to emptiness of alternating two-way automata (Vardi, 1998). Resolution based
decision procedures for guarded fragments are described in (Ganzinger & de Nivelle, 1999;
de Nivelle & de Rijke, 2000) where the approach in (Ganzinger & de Nivelle, 1999) can
be extended to CGF. Finally, a tableau decision procedure for a fragment of GF is given
in (Marx, Schlobach, & Mikulás, 2000). Yet, to the best of our knowledge, there does
not exist a tableau decision procedure that is capable of deciding the full GF, let alone
CGF. In the following, we will supply such an algorithm. As it turns out, the algorithm
as well as the proof of its correctness mainly employ ideas we have already encountered in
the algorithms and proofs for DLs in this thesis. This is another indication for the modal
nature of CGF, since it is amenable to the same techniques successfully used for description
and modal logics.

Let us briefly recall the main "ingredients" of tableau algorithms for modal or descrip- 
tion logics like the ones encountered in this thesis. Satisfiability of a concept C is decided 
by a syntactically guided search for a model for C. Models are usually represented by 
a graph in which the nodes correspond to elements and the edges correspond to the role 
relations in the model. Each node is labelled with a set of concepts that this node must 
satisfy, and new edges and nodes are created as required by existential restrictions. Since 
many modal and description logics have the tree model property, the graphs generated 
by these algorithms are trees, which allows for simpler algorithms and easier implemen- 
tation and optimization of these algorithms. Indeed, some of the fastest implementations 
of modal or description logics satisfiability algorithms use tableau calculi (Horrocks, 2000; 
Patel-Schneider, 2000). 

For many modal or description logics, e.g. K or $\mathcal{ALC}$, termination of these algorithms
is due to the fact that the nesting of universal or existential restrictions of the concepts
appearing at a node strictly decreases with every step from the root of the tree (e.g.,
compare Lemma 3.4). For other logics, e.g., K4, K with the universal modality, or the DLs
$\mathcal{SI}$ and $\mathcal{SHIQ}$, this is no longer true and termination has to be enforced by other means.
One possibility for this is blocking, i.e., stopping the creation of new successor nodes below
a node $v$ if there already is an ancestor node $w$ that is labelled with similar concepts as
$v$ (e.g., compare Lemma 6.6). Intuitively, in this case the model can fold back from the
predecessor of $v$ to $w$, creating a cycle. Unraveling of these cycles recovers an (infinite)
tree model. Since the algorithms guarantee that the concepts occurring in the label of the
nodes stem from a finite set (usually the sub-concepts of the input concept), every growing
path will eventually contain a blocked node, preventing further growth of this path and
(together with a bound on the degree of the tree) ensuring termination of the algorithm.

7.2.1 Tableau Reasoning for CGF 

Our investigation of a tableau algorithm for CGF starts with the observation that CGF
also has some kind of tree model property.

Definition 7.6

Let $\tau$ be a relational vocabulary. A $\tau$-structure $\mathfrak{A}$ has tree width $k$ if $k \in \mathbb{N}$ is minimal
with the following property.

There exists a directed tree $T = (V, E)$ and a function $f : V \rightarrow 2^A$ such that

• for every $v \in V$, $|f(v)| \leq k + 1$,

• for every $R \in \tau$ and $\mathbf{a} \in R^{\mathfrak{A}}$, there exists $v \in V$ with $\mathbf{a} \subseteq f(v)$, and

• for every $a \in A$, the set $V_a = \{v \in V : a \in f(v)\}$ induces a subtree of $T$.

Every node $v$ of $T$ induces a substructure $\mathfrak{F}(v) \subseteq \mathfrak{A}$ of cardinality at most $k + 1$. The tuple
$\langle T, (\mathfrak{F}(v))_{v \in T} \rangle$ is called a tree decomposition of $\mathfrak{A}$.

A logic $\mathcal{L}$ has the generalised tree model property if there exists a computable function
$t$, assigning to every sentence $\psi \in \mathcal{L}$ a natural number $t(\psi)$ such that, if $\psi$ is satisfiable,
then $\psi$ has a model of tree width at most $t(\psi)$. ◇
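
For a finite structure, the three conditions on the function $f$ can be verified directly. The sketch below assumes that the tree is given as a set of (parent, child) edges and $f$ as a dictionary from nodes to sets of elements; it does not check that the edges really form a tree, and it additionally requires that every element of the universe occurs in some bag. All names are hypothetical.

```python
from collections import deque


def is_tree_decomposition(universe, relations, tree_edges, bags, k):
    """Check the bag-size, covering, and connectedness conditions for width k."""
    # Every bag holds at most k + 1 elements.
    if any(len(bag) > k + 1 for bag in bags.values()):
        return False
    # Every tuple of every relation is contained in some bag.
    for tuples in relations.values():
        for tup in tuples:
            if not any(set(tup) <= bag for bag in bags.values()):
                return False
    # Undirected adjacency of the tree.
    adjacent = {v: set() for v in bags}
    for u, v in tree_edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    # For every element, the nodes whose bag contains it must be connected.
    for a in universe:
        nodes = {v for v, bag in bags.items() if a in bag}
        if not nodes:
            return False
        start = next(iter(nodes))
        seen, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacent[u] & nodes:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        if seen != nodes:
            return False
    return True


# A path 1 - 2 - 3 - 4 has a decomposition into bags of size two (width one).
universe = {1, 2, 3, 4}
relations = {"R": {(1, 2), (2, 3), (3, 4)}}
tree_edges = {("n0", "n1"), ("n1", "n2")}
good = {"n0": {1, 2}, "n1": {2, 3}, "n2": {3, 4}}
bad = {"n0": {1, 2}, "n1": {3, 4}, "n2": {2, 3}}
print(is_tree_decomposition(universe, relations, tree_edges, good, 1))
print(is_tree_decomposition(universe, relations, tree_edges, bad, 1))
```

In the second labelling, element 2 occurs in the bags of n0 and n2 but not in that of n1, which lies between them, so the third condition fails.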

Fact 7.7 (Tree Model Property for CGF) 

Every satisfiable sentence $\psi \in$ CGF of width $k$ has a countable model of tree width at
most $k - 1$.

This is a simple corollary of (Grädel, 1999a, Theorem 4), where the same result is
given for $\mu$CGF, that is CGF extended by a least fixed point operator.

Fact 7.7 is the starting point for our definition of a completion tree for a formula
$\psi \in$ CGF. A node $v$ of such a tree no longer stands for a single element of the model (as in
the modal case), but rather for a substructure $\mathfrak{F}(v)$ of a tree decomposition of the model.
To this purpose, we label every node $v$ with a set $\mathbf{C}(v)$ of constants (the elements of the
substructure) and a subset of $cl(\psi, \mathbf{C}(v))$, reflecting the formulas that must hold true for
these elements.

To deal with auxiliary elements (elements helping to form a clique in $G(\mathfrak{A})$ that are not
part of this clique themselves) we will use the auxiliary constant symbol $*$ as a placeholder
for unspecified elements in atoms. The intention is to keep the number of constants at each
node as small as possible. The $*$ will be used for the extra elements occurring in clique
formulas that are not part of the clique itself.

The following definitions are useful when dealing with these generalised atoms. 

Definition 7.8 

Let $K$ denote an infinite set of constants and $* \notin K$. For any set of constants $C \subseteq K$
we set $C^* = C \cup \{*\}$. We use $t_1, t_2, \ldots$ to range over elements of $K^*$. The relation $\geq^*$ is
defined by

$R t_1 \ldots t_n \geq^* R t'_1 \ldots t'_n$ iff for all $i \in \{1 \ldots n\}$ either $t_i = *$ or $t_i = t'_i$.

For an atom $\beta$ and a set of formulas $\Phi$ we define $\beta \in^* \Phi$ iff there is a $\beta' \in \Phi$ with
$\beta \geq^* \beta'$. For a set of constants $C \subseteq K$ and an atom $\beta = R t_1 \ldots t_n$, we define

$\beta \pitchfork C = R t'_1 \ldots t'_n$ where $t'_i = t_i$ if $t_i \in C$, and $t'_i = *$ otherwise.

◇

We use the notation $\mathbf{a}^*$ to indicate that the tuple $\mathbf{a}^*$ may contain $*$'s. Obviously, $\geq^*$ is
transitive and reflexive, and $\beta \pitchfork C \geq^* \beta$ for all atoms $\beta$ and sets of constants $C$.
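
The operations on generalised atoms are simple enough to sketch directly. The encoding below is an assumption for illustration only: an atom is a pair of a relation name and a tuple of terms, and the placeholder is the string "*". The function names are hypothetical.

```python
STAR = "*"


def geq_star(beta, beta_prime):
    """beta >=* beta_prime: beta agrees with beta_prime wherever beta is not *."""
    (rel, terms), (rel_prime, terms_prime) = beta, beta_prime
    return (
        rel == rel_prime
        and len(terms) == len(terms_prime)
        and all(t == STAR or t == tp for t, tp in zip(terms, terms_prime))
    )


def in_star(beta, formulas):
    """beta in* formulas: some atom in the set is at least as specific as beta."""
    return any(geq_star(beta, other) for other in formulas)


def restrict(beta, constants):
    """Restriction of beta to a set of constants: all other terms become *."""
    rel, terms = beta
    return (rel, tuple(t if t in constants else STAR for t in terms))


beta = ("R", ("a", "b", "c"))
print(restrict(beta, {"a", "c"}))
print(geq_star(restrict(beta, {"a", "c"}), beta))
print(in_star(("R", ("a", STAR, STAR)), {beta}))
```

The second call reflects the remark above that restricting an atom to a set of constants always yields an atom that is above the original one in the relation.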

While these are all syntactic notions, they have a semantic counterpart that clarifies
the intuition of $*$ standing for an unspecified element. Let $\mathbf{a}'$ denote the tuple obtained
from a tuple $\mathbf{a}^*$ by replacing every occurrence of $*$ in $\mathbf{a}^*$ with a distinct fresh variable, and
let $\mathbf{z}$ be precisely the variables used in this replacement. For an atom $\beta$, we define

$\mathfrak{A} \models \beta(\mathbf{a}^*)$ iff $\mathfrak{A} \models \exists \mathbf{z}.\beta(\mathbf{a}')$.

It is easy to see that

$\beta(\mathbf{a}) \geq^* \beta(\mathbf{b})$ implies $\beta(\mathbf{b}) \models \beta(\mathbf{a})$, and
$\beta(\mathbf{a}) \in^* \Phi$ implies $\Phi \models \beta(\mathbf{a})$

because, if $\mathbf{a} \geq^* \mathbf{b}$, then $\mathbf{b}$ is obtained from $\mathbf{a}$ by replacing some $*$ with constants, which
provide witnesses for the existential quantifier.

We further write $\Phi|_C$ to denote the subset of $\Phi$ containing all formulas that only use
constants in $C$.

Algorithm 7.9 (The CGF-algorithm) 

Let $\psi \in$ CGF be a closed formula in NNF. A completion tree $\mathbf{T} = (V, E, \mathbf{C}, \Delta, \mathbf{N})$ for $\psi$ is
a node labelled tree $(V, E)$ with the labelling function $\mathbf{C}$ labelling each node $v \in V$ with a
subset of $K$, $\Delta$ labelling each node $v \in V$ with a subset of $cl(\psi, \mathbf{C}(v)^*)$ where all formulas
$\beta(\mathbf{x}, *, \ldots, *) \in \Delta(v)$ using $*$ are atoms (excluding equality statements), and the function
$\mathbf{N} : V \rightarrow \mathbb{N}$ mapping each node to a distinct natural number, with the additional property
that, if $v$ is an ancestor of $w$, then $\mathbf{N}(v) < \mathbf{N}(w)$.

A constant $c \in K$ is called shared between two nodes $v_1, v_2 \in V$, if $c \in \mathbf{C}(v_1) \cap \mathbf{C}(v_2)$,
and $c \in \mathbf{C}(w)$ for all nodes $w$ on the (unique, undirected, possibly empty) shortest path
connecting $v_1$ to $v_2$.

A node $v \in V$ is called directly blocked^1 by a node $w \in V$, if $w$ is not blocked,
$\mathbf{N}(w) < \mathbf{N}(v)$, and there is an injective mapping $\pi$ from $\mathbf{C}(v)$ into $\mathbf{C}(w)$ such that, for all
constants $c \in \mathbf{C}(v)$ that are shared between $v$ and $w$, $\pi(c) = c$, and
$\pi(\Delta(v)) = \Delta(w)|_{\pi(\mathbf{C}(v)^*)}$. Here and throughout this thesis we use the convention
$\pi(*) = *$ for every function $\pi$ that verifies a blocking.

A node is called blocked if it is directly blocked or if its predecessor is blocked.

^1 The definition of blocking is recursive. Like for the $\mathcal{SI}$- and the $\mathcal{SHIQ}$-algorithm, this does
not cause any problems because the status of a node $v$ only depends on its label and the status
of nodes $w$ with $\mathbf{N}(w) < \mathbf{N}(v)$. The recursion terminates at the root node, where the
$\mathbf{N}$-value is minimal.

A completion tree $\mathbf{T}$ contains a clash if there is a node $v \in V$ such that

• for a constant $c \in \mathbf{C}(v)$, $c \neq c \in \Delta(v)$, or

• there is an atomic formula $\beta$ and a tuple $\mathbf{a} \subseteq \mathbf{C}(v)$ such that
$\{\beta(\mathbf{a}), \neg\beta(\mathbf{a})\} \subseteq \Delta(v)$.

Otherwise, $\mathbf{T}$ is called clash-free. A completion tree $\mathbf{T}$ is called complete if none of the
completion rules given in Figure 7.1 can be applied to $\mathbf{T}$. A complete and clash-free
completion tree for $\psi$ is called a tableau for $\psi$.

To test $\psi$ for satisfiability, the tableau algorithm creates an initial tree with only a
single node $v_0$, $\Delta(v_0) = \{\psi\}$ and $\mathbf{C}(v_0) = \{a_0\}$ for an arbitrary constant $a_0$. The rules from
Figure 7.1 are applied until either a clash occurs, producing output "$\psi$ is not satisfiable",
or the tree is complete, in which case "$\psi$ is satisfiable" is output. ◇
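
The clash condition for a single node can be sketched as follows. The encoding is assumed for illustration: the constants of a node are a Python set, and its formula label is a set of tuples of the forms ("atom", R, args), ("not", atom), and ("neq", a, b). Only the two clash conditions from the definition above are checked; the function name `has_clash` is hypothetical.

```python
def has_clash(constants, delta):
    """Return True iff the node label `delta` contains a clash over `constants`."""
    for formula in delta:
        # First condition: an inequality c != c for a constant c of the node.
        if formula[0] == "neq" and formula[1] == formula[2] and formula[1] in constants:
            return True
        # Second condition: an atom over constants of the node and its negation.
        if formula[0] == "not" and formula[1][0] == "atom":
            atom = formula[1]
            if atom in delta and set(atom[2]) <= set(constants):
                return True
    return False


r_ab = ("atom", "R", ("a", "b"))
r_ba = ("atom", "R", ("b", "a"))
print(has_clash({"a", "b"}, {r_ab, ("not", r_ab)}))  # complementary atoms
print(has_clash({"a", "b"}, {r_ab, ("not", r_ba)}))  # different argument order
print(has_clash({"a"}, {("neq", "a", "a")}))         # a != a
```

A full completion tree is clash-free if this check fails at every node.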

The set $\mathbf{C}(v_0)$ is initialized with a non-empty set of constants to make sure that empty
structures are excluded. For a discussion of the different kinds of non-determinism that
occur in the CGF-algorithm, see below Lemma 7.12.

While our notion of tableaux has many similarities to the tableaux appearing in (Grädel
& Walukiewicz, 1999), there are two important differences that make the version used here
more suitable as basis for a tableau algorithm. We will see that every completion tree
generated by the tableau algorithm is finite. Conversely, tableaux in (Grädel & Walukiewicz,
1999), in general, can be infinite. Also, in (Grädel & Walukiewicz, 1999) every node is
labelled with a complete $(\psi, \mathbf{C}(v))$-type, i.e., every formula $\phi \in clos(\psi, \mathbf{C}(v))$ is explicitly
asserted true or false at $v$. Conversely, a completion tree contains only assertions about
relevant formulas. This implies a lower degree of non-determinism in the algorithm, which
is important for an efficient implementation.

7.2.2 Correctness 

The techniques used to establish correctness of the CGF-algorithm bear a strong
resemblance to the techniques we have employed for the tableau algorithms for description logics
in the previous chapters. Extra complexity is added by the fact that completion trees for
CGF are more complex objects than the completion trees for Description Logics, mainly
because each node now stands for a substructure rather than for a single element of the
model.

Termination 

The following technical lemma is a consequence of the completion rules and the blocking 
condition. 

Lemma 7.10 

Let $\psi \in$ CGF be a sentence in NNF with $|\psi| = n$, $\mathrm{width}(\psi) = m$, and $\mathbf{T}$ a completion tree
generated for $\psi$ by application of the rules in Figure 7.1. For every node $v$ in $\mathbf{T}$,

1. $|\mathbf{C}(v)| \leq m$,

2. $|\Delta(v)| \leq n \times (m + 1)^m$, and

3. any $\ell > 2^{n \times (m+1)^m}$ distinct nodes in $\mathbf{T}$ contain a blocked node.

Proof. Nodes are only generated when initializing the tree (with a single constant) and
by the $\rightarrow_\exists$-rule and no constants are added to a $\mathbf{C}(v)$ once $v$ has been generated (but some
may be removed by application of the $\rightarrow_=$-rule).

When triggered by the formula $(\exists \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z}))\phi(\mathbf{a}, \mathbf{y})$, the $\rightarrow_\exists$-rule initializes $\mathbf{C}(w)$
such that it contains $\mathbf{a}$ and another constant for every variable in $\mathbf{y}$ and $\mathbf{z}$. Hence,

$$|\mathbf{C}(w)| \leq |\mathbf{a} \cup \mathbf{y} \cup \mathbf{z}| \leq |\mathrm{free}(\alpha)| \leq \mathrm{width}(\psi).$$

The set $\Delta(v)$ is a subset of $cl(\psi, \mathbf{C}(v)^*)$, for which $|cl(\psi, \mathbf{C}(v)^*)| \leq n \times (m + 1)^m$ holds
because there are at most $n$ formulas in $cl(\psi)$, each of which has at most $m$ free variables.
There are at most $(|\mathbf{C}(v)| + 1)^m$ distinct sequences of length $m$ with constants from $\mathbf{C}(v)^*$.



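Figure 7.1 The completion rules for CGF

$\rightarrow_\wedge$:
  if    $\phi \wedge \theta \in \Delta(v)$ and
        $\{\phi, \theta\} \not\subseteq \Delta(v)$
  then  $\Delta(v) \rightarrow_\wedge \Delta(v) \cup \{\phi, \theta\}$

$\rightarrow_\vee$:
  if    $\phi \vee \theta \in \Delta(v)$ and
        $\{\phi, \theta\} \cap \Delta(v) = \emptyset$
  then  $\Delta(v) \rightarrow_\vee \Delta(v) \cup \{\chi\}$ for some $\chi \in \{\phi, \theta\}$

$\rightarrow_=$:
  if    $a = b \in \Delta(v)$ and $a \neq b$
  then  for all $w$ that share $a$ with $v$, $\mathbf{C}(w) \rightarrow_= (\mathbf{C}(w) \setminus \{a\}) \cup \{b\}$
        and $\Delta(w) \rightarrow_= \Delta(w)[a \mapsto b]$

$\rightarrow_\forall$:
  if    $(\forall \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z}))\phi(\mathbf{a}, \mathbf{y}) \in \Delta(v)$, and
        there exists $\mathbf{b} \subseteq \mathbf{C}(v)$ such that for all $\beta(\mathbf{x}, \mathbf{y}, \mathbf{z}) \in \alpha$,
        $\beta(\mathbf{a}, \mathbf{b}, * \cdots *) \in^* \Delta(v)$, and
        $\phi(\mathbf{a}, \mathbf{b}) \notin \Delta(v)$
  then  $\Delta(v) \rightarrow_\forall \Delta(v) \cup \{\phi(\mathbf{a}, \mathbf{b})\}$

$\rightarrow_\exists$:
  if    $(\exists \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z}))\phi(\mathbf{a}, \mathbf{y}) \in \Delta(v)$, and
        for every $\mathbf{b}, \mathbf{c} \subseteq \mathbf{C}(v)$, $\{\alpha(\mathbf{a}, \mathbf{b}, \mathbf{c}), \phi(\mathbf{a}, \mathbf{b})\} \not\subseteq \Delta(v)$, and
        there is no child $w$ of $v$ with $\{\alpha(\mathbf{a}, \mathbf{b}, \mathbf{c}), \phi(\mathbf{a}, \mathbf{b})\} \subseteq \Delta(w)$
        for some $\mathbf{b}, \mathbf{c} \subseteq \mathbf{C}(w)$, and
        $v$ is not blocked
  then  $V \rightarrow_\exists V \cup \{w\}$, $E \rightarrow_\exists E \cup \{(v, w)\}$ for a fresh node $w$;
        let $\mathbf{b}, \mathbf{c}$ be sequences of distinct and fresh constants that match
        the lengths of $\mathbf{y}, \mathbf{z}$ and set
        $\mathbf{C}(w) = \mathbf{a} \cup \mathbf{b} \cup \mathbf{c}$, and
        $\Delta(w) = \{\alpha(\mathbf{a}, \mathbf{b}, \mathbf{c}), \phi(\mathbf{a}, \mathbf{b})\}$, and
        $\mathbf{N}(w) = 1 + \max\{\mathbf{N}(v) : v \in V \setminus \{w\}\}$

$\rightarrow_\updownarrow$:
  if    $\beta(\mathbf{a}^*) \in \Delta(v)$, $\beta$ atomic, not an equality, and
        $w$ is a neighbour of $v$ with $\mathbf{a}^* \cap \mathbf{C}(w) \neq \emptyset$, and
        $\beta(\mathbf{a}^*) \pitchfork \mathbf{C}(w) \notin \Delta(w)$
  then  $\Delta(w) \rightarrow_\updownarrow \Delta(w) \cup \{\beta(\mathbf{a}^*) \pitchfork \mathbf{C}(w)\}$

$\rightarrow_{\updownarrow\forall}$:
  if    $\phi(\mathbf{a}) \in \Delta(v)$, $\phi(\mathbf{a})$ is universally quantified, and
        $w$ is a neighbour of $v$ with $\mathbf{a} \subseteq \mathbf{C}(w)$, and
        $\phi(\mathbf{a}) \notin \Delta(w)$
  then  $\Delta(w) \rightarrow_{\updownarrow\forall} \Delta(w) \cup \{\phi(\mathbf{a})\}$

Let $v_1, \ldots, v_\ell$ be $\ell > 2^{n \times (m+1)^m}$ distinct nodes. For every $v_i$, we will construct an
injective mapping $\pi_i : \mathbf{C}(v_i) \rightarrow \{1, \ldots, m\}$ such that, if a constant $a$ is shared between two
nodes $v_i, v_j$, then $\pi_i(a) = \pi_j(a)$.

Let $u_1, \ldots, u_k$ denote the nodes of a subtree of $\mathbf{T}$ that contains every node $v_i$ and that
is rooted at $u_1$. By induction over the distance to $u_1$, we define an injective mapping
$\nu_i : \mathbf{C}(u_i) \rightarrow \{1, \ldots, m\}$ for every $i \in \{1, \ldots, k\}$ as follows. For $u_1$ we pick an arbitrary
injective function from $\mathbf{C}(u_1)$ to $\{1, \ldots, m\}$. For a node $u_i$ let $u_j$ be the predecessor of $u_i$
in $\mathbf{T}$ and $\nu_j$ the corresponding function, which has already been defined because $u_j$ has a
smaller distance to $u_1$ than $u_i$. For $\nu_i$ we choose an arbitrary injective function such that
$\nu_i(a) = \nu_j(a)$ for all $a \in \mathbf{C}(u_i) \cap \mathbf{C}(u_j)$.

All mappings $\nu_i$ are injective. For any constant $a$ the set $V_a := \{v \in V \mid a \in \mathbf{C}(v)\}$
induces a subtree of $\mathbf{T}$. If $u_i, u_j \in V_a$ are neighbours, the definition above ensures
$\nu_i(a) = \nu_j(a)$. By induction over the length of the shortest connecting path we obtain the same
for arbitrary $u_i, u_j \in V_a$.

For every node $v_i$ there is a $j_i$ such that $v_i = u_{j_i}$ and we set $\pi_i = \nu_{j_i}$. There are at
most $2^{n \times (m+1)^m}$ distinct subsets of $cl(\psi, \{1, \ldots, m, *\})$. Hence, there must be two nodes
$v_i, v_j$ such that $\pi_i(\Delta(v_i)) = \pi_j(\Delta(v_j))$ and, w.l.o.g., $\mathbf{N}(v_i) < \mathbf{N}(v_j)$. This implies that
$v_j$ is blocked by $v_i$ via $\pi := \pi_i^{-1} \circ \pi_j$. Note that for $\pi$ to be well-defined, $\pi_i$ must be
injective. By construction, $\pi$ preserves shared constants. Since $\pi_i(\Delta(v_i)) = \pi_j(\Delta(v_j))$,
$\pi(\Delta(v_j)) = \Delta(v_i)|_{\pi(\mathbf{C}(v_j)^*)}$ holds. ■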

Lemma 7.11 (Termination) 

Let $\psi \in$ CGF be a sentence in NNF. Any sequence of rule applications of the tableau
algorithm starting from the initial tree terminates.

Proof. For any completion tree $\mathbf{T}$ generated by the tableau algorithm, we define
$\|\cdot\| : V \rightarrow \mathbb{N}^3$ by

$$\|v\| := \big(|\mathbf{C}(v)|,\; n \times (m+1)^m - |\Delta(v)|,\; |\{\phi \in \Delta(v) : \phi \text{ triggers the } \rightarrow_\exists\text{-rule for } v\}|\big).$$

The lexicographic order $\prec$ on $\mathbb{N}^3$ is well-founded, i.e. it has no infinite decreasing chains.
Any rule application decreases $\|v\|$ w.r.t. $\prec$ for at least one node $v$, and never increases $\|v\|$
w.r.t. $\prec$ for an existing node $v$. However, it may create new successors, one at a time. Since
$\prec$ is well-founded, there can only be a finite number of applications of rules to every node
in $\mathbf{T}$ and hence a finite number of successors. Consequently, an infinite sequence of rule
applications would generate a tree of infinite depth.

Yet, as a corollary of Lemma 7.10, we have that the depth of $\mathbf{T}$ is bounded by $2^{n \times (m+1)^m}$.
For assume that there is a path of length $> 2^{n \times (m+1)^m}$ in $\mathbf{T}$ with deepest node $v$. By the
time $v$ has been created (by an application of the $\rightarrow_\exists$-rule to its predecessor $u$), the path
from the root of $\mathbf{T}$ to $u$ contains at least $2^{n \times (m+1)^m}$ nodes, and hence a blocked node. This
implies that $u$ is blocked too, and the $\rightarrow_\exists$-rule cannot be applied to create $v$. ■

Completeness

Lemma 7.12

Let $\psi \in$ CGF be a closed formula in NNF. If $\psi$ is satisfiable, then there is a sequence of
rule applications starting from the initial tree that yields a tableau.

Proof. Since $\psi$ is satisfiable, there is a model $\mathfrak{A}$ of $\psi$. We will use $\mathfrak{A}$ to guide the
application of the non-deterministic $\rightarrow_\vee$-rule. For this we incrementally define a function
$g : \bigcup\{\mathbf{C}(v) \mid v \in V\} \rightarrow A$ such that for all $v \in V$ : $\mathfrak{A} \models g(\Delta(v))$. We refer to this
property by (§).

The set $\Delta(v)$ can contain atomic formulas $\alpha(\mathbf{a}^*)$, where $*$ occurs at some positions of
$\mathbf{a}^*$. The constant $*$ is not mapped to an element of $A$ by $g$. We deal with this as described
just after Definition 7.8 by setting

$\mathfrak{A} \models g(\alpha(\mathbf{a}^*))$ iff $\mathfrak{A} \models \exists \mathbf{z}.g(\alpha(\mathbf{a}'))$.

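Claim 7.13 If, for a completion tree $\mathbf{T}$, there exists a function $g$, such that (§) holds and
a rule is applicable to $\mathbf{T}$, then it can be applied in a way that maintains (§).

• For the $\rightarrow_\wedge$- and the $\rightarrow_\vee$-rule this is obvious.

• If the $\rightarrow_=$-rule is applicable to $v \in V$ with $a = b \in \Delta(v)$, then, since
$\mathfrak{A} \models g(a) = g(b)$, $g(a) = g(b)$ must hold. Hence, for every node $w$ that shares $a$ with $v$,
$g(\Delta(w)) = g(\Delta(w)[a \mapsto b])$ and the rule can be applied without violating (§).

• If the $\rightarrow_\forall$-rule is applicable to $v \in V$ with $(\forall \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z}))\phi(\mathbf{a}, \mathbf{y}) \in \Delta(v)$ and
$\mathbf{b} \subseteq \mathbf{C}(v)$ with $\beta(\mathbf{a}, \mathbf{b}, * \cdots *) \in^* \Delta(v)$ for all atoms $\beta(\mathbf{x}, \mathbf{y}, \mathbf{z}) \in \alpha$, then, from the
definition of $\in^*$, there is a tuple $\mathbf{c}^* \subseteq \mathbf{C}(v)^*$, such that
$\beta(\mathbf{a}, \mathbf{b}, * \cdots *) \geq^* \beta(\mathbf{a}, \mathbf{b}, \mathbf{c}^*)$ and $\beta(\mathbf{a}, \mathbf{b}, \mathbf{c}^*) \in \Delta(v)$. From (§) we get that
$\mathfrak{A} \models \exists \mathbf{z}.\beta(g(\mathbf{a}), g(\mathbf{b}), \mathbf{z})$ and since every $z \in \mathbf{z}$ appears exactly once in $\alpha$, also
$\mathfrak{A} \models \exists \mathbf{z}.\alpha(g(\mathbf{a}), g(\mathbf{b}), \mathbf{z})$. Hence, we have

$\mathfrak{A} \models \{\forall \mathbf{y}\mathbf{z}.(\alpha(g(\mathbf{a}), \mathbf{y}, \mathbf{z}) \rightarrow \phi(g(\mathbf{a}), \mathbf{y})),\; \exists \mathbf{z}.\alpha(g(\mathbf{a}), g(\mathbf{b}), \mathbf{z})\}$,

which, by Lemma 7.3, implies $\mathfrak{A} \models \phi(g(\mathbf{a}), g(\mathbf{b}))$ and hence $\phi(\mathbf{a}, \mathbf{b})$ can be added to
$\Delta(v)$ without violating (§).

• If the $\rightarrow_\exists$-rule is applicable to $v \in V$ with $(\exists \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z}))\phi(\mathbf{a}, \mathbf{y}) \in \Delta(v)$, then
this implies

$\mathfrak{A} \models g((\exists \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z}))\phi(\mathbf{a}, \mathbf{y}))$.

Hence, there are sequences $\mathbf{b}', \mathbf{c}' \subseteq A$ such that
$\mathfrak{A} \models \{\alpha(g(\mathbf{a}), \mathbf{b}', \mathbf{c}'), \phi(g(\mathbf{a}), \mathbf{b}')\}$. If we define $g$ such that $g(\mathbf{b}) = \mathbf{b}'$ and
$g(\mathbf{c}) = \mathbf{c}'$, then $\mathfrak{A} \models \{g(\alpha(\mathbf{a}, \mathbf{b}, \mathbf{c})), g(\phi(\mathbf{a}, \mathbf{b}))\}$. Note that this might involve
setting $g(b_1) = g(b_2)$ for some $b_1, b_2 \in \mathbf{b}$. With this construction the resulting extended
completion-tree $\mathbf{T}$ and extended function $g$ again satisfy (§).

• If the $\rightarrow_\updownarrow$-rule is applicable to $v \in V$ with $\beta(\mathbf{a}^*) \in \Delta(v)$ and a neighbour $w$ with
$\mathbf{a}^* \cap \mathbf{C}(w) \neq \emptyset$, then it adds $\beta(\mathbf{a}^*) \pitchfork \mathbf{C}(w)$ to $\Delta(w)$. From (§) we get that
$\mathfrak{A} \models \beta(g(\mathbf{a}^*))$, and since $\beta(\mathbf{b}^*) := \beta(\mathbf{a}^*) \pitchfork \mathbf{C}(w) \geq^* \beta(\mathbf{a}^*)$, this implies
$\mathfrak{A} \models \beta(g(\mathbf{b}^*))$. Hence, adding $\beta(\mathbf{a}^*) \pitchfork \mathbf{C}(w) = \beta(\mathbf{b}^*)$ to $\Delta(w)$ does not violate (§).

• If the $\rightarrow_{\updownarrow\forall}$-rule is applicable to a node $v \in V$ with a universally quantified formula
$\phi(\mathbf{a}) \in \Delta(v)$ and a neighbour $w$ which shares $\mathbf{a}$ with $v$, (§) yields $\mathfrak{A} \models \phi(g(\mathbf{a}))$.
Hence, adding $\phi(\mathbf{a})$ to $\Delta(w)$ does not violate (§).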

Claim 7.14 A completion-tree T for which a function g exists such that (§) holds is clash 
free. 

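Assume that $T$ contains a clash, namely, there is a node $v \in V$ such that either 
$a \neq a \in \Delta(v)$ (implying $\mathfrak{A} \models g(a) \neq g(a)$, a contradiction), or there is a sequence $\mathbf{a} \subseteq C(v)$ and 
an atomic formula $\beta$ such that $\{\beta(\mathbf{a}), \neg\beta(\mathbf{a})\} \subseteq \Delta(v)$. From (§), $\mathfrak{A} \models \{\beta(g(\mathbf{a})), \neg\beta(g(\mathbf{a}))\}$ 
would follow, also a contradiction. 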

These claims yield Lemma 7.12 as follows. Let $T$ be a tableau for $\psi$. Since $\mathfrak{A} \models \psi$, 
(§) is satisfied for the initial tree together with the function $g$ mapping $a$ to an arbitrary 
element of the universe of $\mathfrak{A}$. By Lemma 7.11, any sequence of rule applications is finite, and 
from Claim 7.13 we get that there is a sequence of rule applications that maintains (§). By 
Claim 7.14, this sequence results in a tableau. This completes the proof of Lemma 7.12. 



Lemma 7.12 involves two different kinds of non-determinism, namely, the choice which 
rule to apply to which constraint (as several rules might be applicable simultaneously), 
and which disjunct to choose in an application of the $\to_\vee$-rule. While the latter choice is 
don't-know non-deterministic, i.e., for a satisfiable formula only certain choices will lead to 
the discovery of a tableau, the former choice is don't-care non-deterministic. This means 
that arbitrary choices of which rule to apply next will lead to the discovery of a tableau for 
a satisfiable formula. For an implementation of the tableau algorithm this has the following 
consequences. Exhaustive search is necessary to deal with all possible expansions of the 
$\to_\vee$-rule, but arbitrary strategies of choosing which rule to apply next, and where to apply 
it, will lead to a correct implementation, although the efficiency of the implementation will 
strongly depend on a sophisticated strategy. 
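
The difference between the two kinds of non-determinism can be illustrated on a much simpler calculus. The following is a minimal sketch for propositional formulas in negation normal form, not the CGF algorithm itself; the tuple encoding and the names `has_clash` and `expand` are assumptions made for this illustration only. The choice of which rule instance to apply is made once and never revisited (don't-care), whereas both disjuncts of a disjunction are explored (don't-know).

```python
# Formulas in negation normal form, encoded as tuples (hypothetical encoding):
#   ("lit", name, polarity), ("and", f, g), ("or", f, g)

def has_clash(label):
    """A label clashes if it contains a literal together with its negation."""
    lits = {(f[1], f[2]) for f in label if f[0] == "lit"}
    return any((name, not pol) in lits for (name, pol) in lits)

def expand(label):
    """Return True iff the label can be extended to a complete, clash-free one."""
    if has_clash(label):
        return False
    # Don't-care choice: take any applicable conjunction; this pick is never undone.
    for f in label:
        if f[0] == "and" and not {f[1], f[2]} <= label:
            return expand(label | {f[1], f[2]})
    # Don't-know choice: for a disjunction both alternatives must be searched.
    for f in label:
        if f[0] == "or" and f[1] not in label and f[2] not in label:
            return expand(label | {f[1]}) or expand(label | {f[2]})
    return True  # no rule applicable and no clash
```

Iterating over a `frozenset` visits the formulas in an arbitrary order, which mirrors the claim that any rule-selection strategy is correct; only the `or` branch needs backtracking.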

Soundness 

In order to prove the correctness of the tableau algorithm we have to show that the existence 
of a tableau for $\psi$ implies satisfiability of $\psi$. To this end, we will construct a model 
from a tableau. From the construction employed in the proof we obtain an alternative 
proof of Fact 7.7. 

Lemma 7.15 

Let $\psi \in \mathrm{CGF}[\tau]$ with $k = \mathrm{width}(\psi)$ and let $T$ be a tableau for $\psi$ generated by the tableau 
algorithm. Then $\psi$ is satisfiable and has a model of tree width at most $k - 1$. 
Proof. Let $T = (V, E, C, \Delta, N)$ be a tableau for $\psi$. For every direct blocking situation we fix 
a mapping $\pi$ verifying this blocking. Using an unraveling construction, we will construct a 
model $\mathfrak{A}$ for $\psi$ of width at most $k - 1$ from $T$. First, we "unravel" blocking situations in $T$ 
by successively replacing every blocked node with a copy of the subtree of $T$ rooted at the 
blocking node. Formally, this is achieved by the following path construction. We define 

V M = {v G V : v is not blocked or directly blocked }. 

Since from now on we only deal with nodes from V n , every blocking is direct and we will 
no longer explicitly mention this fact. 

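The set Paths(T) is inductively defined by² 

• $[\frac{v_0}{v_0}] \in \mathrm{Paths}(T)$ for the root $v_0$ of $T$, 

• if $[\frac{v_0}{v_0'}, \ldots, \frac{v_n}{v_n'}] \in \mathrm{Paths}(T)$, the node $w$ is a successor of $v_n$ and $w$ is not blocked, then 
$[\frac{v_0}{v_0'}, \ldots, \frac{v_n}{v_n'}, \frac{w}{w}] \in \mathrm{Paths}(T)$, 

• if $[\frac{v_0}{v_0'}, \ldots, \frac{v_n}{v_n'}] \in \mathrm{Paths}(T)$ and $w$ is a successor of $v_n$ blocked by the node $u \in V$, then 
$[\frac{v_0}{v_0'}, \ldots, \frac{v_n}{v_n'}, \frac{u}{w}] \in \mathrm{Paths}(T)$. 

The set Paths(T) forms a tree, with $p'$ being a successor of $p$ if $p'$ is obtained from $p$ 
by concatenating one element $\frac{u}{w}$ at the end. We define the auxiliary functions Tail, Tail' by 
setting $\mathrm{Tail}(p) = v_n$ and $\mathrm{Tail}'(p) = v_n'$ for every path $p = [\frac{v_0}{v_0'}, \ldots, \frac{v_n}{v_n'}]$. 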
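
As an illustration of the unraveling, the following sketch enumerates the paths up to a given length. It is only a sketch under stated assumptions: each path element is stored as a pair (Tail, Tail'), i.e. the node whose label is used and the node actually reached in the tree; the dictionaries `successors` and `blocker` and the function name `paths` are hypothetical; and blocked nodes are assumed to have no successors of their own in the input. Since a blocked node is replaced by its blocking node, the set of paths is infinite as soon as there is a block, hence the explicit length bound.

```python
def paths(root, successors, blocker, max_len):
    """Enumerate all paths with at most max_len elements.

    successors: node -> list of successor nodes in the completion-tree
    blocker:    directly blocked node -> the node that blocks it
    Each path is a tuple of (tail, tail_prime) pairs.
    """
    frontier = [((root, root),)]
    result = []
    while frontier:
        p = frontier.pop()
        result.append(p)
        if len(p) == max_len:
            continue
        v_n = p[-1][0]  # Tail(p): continue from the node that supplies the label
        for w in successors.get(v_n, []):
            if w in blocker:
                # blocked successor: record blocking node (Tail) and blocked node (Tail')
                frontier.append(p + ((blocker[w], w),))
            else:
                frontier.append(p + ((w, w),))
    return result
```

For a tree with root `r`, successor `s`, and a successor `t` of `s` that is blocked by `s`, every path beyond length two ends in the pair `("s", "t")`, which corresponds to repeatedly pasting a copy of the subtree rooted at the blocking node.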

Intuitively, for every node $v$ of $T$, the paths $p \in \mathrm{Paths}(T)$ with $v = \mathrm{Tail}(p)$ stand 
for distinct copies of $v$ created by the unraveling. The universe of $\mathfrak{A}$ consists of (classes 
of) constants labelling nodes in $T$ paired with the paths at whose Tail they appear, to 
distinguish constants occurring at different copies of a node of $T$. Formally, we define 

$$C(T) = \{(a,p) : p \in \mathrm{Paths}(T) \wedge a \in C(\mathrm{Tail}(p))\}.$$

Constants appearing at consecutive nodes of $T$ stand for the same element and the same 
holds for constants related by a mapping $\pi$ verifying a block. Hence, to obtain the universe 
of $\mathfrak{A}$, we factorize $C(T)$ as follows. Let $\sim$ be the smallest symmetric relation on $C(T)$ 
satisfying 

• $(a,p) \sim (a,q)$ if $q$ is a successor of $p$ in Paths(T), $\mathrm{Tail}'(q)$ is an unblocked successor 
of $\mathrm{Tail}(p)$, and $a \in C(\mathrm{Tail}(p)) \cap C(\mathrm{Tail}'(q))$, 

• $(a,p) \sim (b,q)$ if $q$ is a successor of $p$ in Paths(T), $\mathrm{Tail}'(q)$ is a blocked successor of 
$\mathrm{Tail}(p)$, $a \in C(\mathrm{Tail}(p)) \cap C(\mathrm{Tail}'(q))$, and $\pi(a) = b$ for the function $\pi$ that verifies that 
$\mathrm{Tail}'(q)$ is blocked by $\mathrm{Tail}(q)$. 

² This complicated form of unraveling, where we record both blocked and blocking node, is necessary 
because there might be a situation where two successors $v_1, v_2$ of a node are directly blocked by the same 
node $w$. 






With $\approx$ we denote the reflexive, transitive closure of $\sim$ and with $[a,p]_\approx$ the class of $(a,p)$, 
i.e., the set $\{(b,q) \in C(T) \mid (b,q) \approx (a,p)\}$. Since $(a,p) \sim (b,q)$ only if $p, q$ are neighbours in 
Paths(T), for every $(a,p)$, the set 

$$\mathrm{Paths}([a,p]_\approx) := \{q \mid \exists b.(b,q) \in [a,p]_\approx\}$$

is a subtree of Paths(T). 

The classes of $C(T)/{\approx}$ will be the elements of the universe of $\mathfrak{A}$. First we need to 
prove some technicalities for this construction. 
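
On a finite portion of C(T), the classes of the closure can be computed with a standard union-find structure. The sketch below is illustrative only: it assumes the base relation is handed over as an explicit list of related pairs, and the names `classes` and `paths_of` are hypothetical.

```python
def classes(elements, base_pairs):
    """Partition `elements` into the classes of the reflexive, symmetric,
    transitive closure of the relation given by `base_pairs`."""
    parent = {e: e for e in elements}

    def find(e):
        # follow parent links to the representative, halving the path on the way
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for x, y in base_pairs:
        parent[find(x)] = find(y)  # merge the two classes

    out = {}
    for e in elements:
        out.setdefault(find(e), set()).add(e)
    return list(out.values())

def paths_of(cls):
    """The set of paths at which some member of the class occurs."""
    return {q for (_, q) in cls}
```

Each element is a pair of a constant and a path identifier; `paths_of` then yields the set of paths associated with a class, which by the argument above is connected in the tree of paths.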

Claim 7.16 Let $p \in \mathrm{Paths}(T)$ and $a, b \in C(\mathrm{Tail}(p))$. Then $(a,p) \approx (b,p)$ iff $a = b$. 

Assume the claim does not hold and let $a \neq b$ with $(a,p) \approx (b,p)$. By definition of 
$\sim$, $(a,p) \not\sim (b,p)$ must hold. Hence, there must be a path $(c_1,p_1) \sim \cdots \sim (c_k,p_k)$ such 
that $a = c_1$, $b = c_k$, and $p = p_1 = p_k$. W.l.o.g., assume we have picked $a, b, p$ such that 
this path has minimal length $k$. Such a minimal path must be of length $k = 3$, for if we 
assume a path of length $k > 3$, there must be $2 \leq i < j \leq k - 1$ such that $p_i = p_j$, 
because the relation $\sim$ is defined along paths in the tree Paths(T). If $c_i = c_j$ then we 
can shorten the path between positions $i$ and $j$ and obtain a shorter path. If $c_i \neq c_j$, then 
the path $(c_i,p_i) \sim \cdots \sim (c_j,p_j)$ is also a shorter path with the same properties. Hence, 
a minimal path must be of the form $(a,p) \sim (c,q) \sim (b,p)$. If $\mathrm{Tail}'(q)$ is not blocked, by 
the definition of $\sim$, $a = c = b$ must hold. Hence, since $a \neq b$, $\mathrm{Tail}'(q)$ must be blocked by 
$\mathrm{Tail}(q)$. From the definition of $\sim$ we have $a, b \in C(\mathrm{Tail}'(q))$ and $\pi(a) = c = \pi(b)$ for the 
function $\pi$ verifying that $\mathrm{Tail}'(q)$ is blocked by $\mathrm{Tail}(q)$. Since $\pi$ must be injective, this is a 
contradiction. 

Since the set Paths(T) is a tree, and as a consequence of Claim 7.16, we get the following. 

Claim 7.17 Let $p, q \in \mathrm{Paths}(T)$ with $p = [\ldots, \frac{v_n}{v_n'}]$, $q = [p, \frac{w}{w'}]$. If, for $a \in C(v_n)$, $b \in 
C(w)$, $(a,p) \approx (b,q)$ then $(a,p) \sim (b,q)$. 

If $(a,p) \approx (b,q)$ then there must be a path $(c_1,p_1) \sim \cdots \sim (c_k,p_k)$ such that $a = c_1$, 
$b = c_k$, $p = p_1$, and $q = p_k$. Since $\sim$ is only defined along paths in the tree Paths(T), there 
must be a step from $p$ to $q$ (or, dually, from $q$ to $p$) in this path; more precisely, there must 
be an $i \in \{1, \ldots, k - 1\}$ such that $p_i = p$ and $p_{i+1} = q$ holds. Hence, we have the situation 

$$(a,p) \approx (c_i,p) \sim (c_{i+1},q) \approx (b,q).$$

Claim 7.16 implies $a = c_i$ and $b = c_{i+1}$ and hence $(a,p) \sim (b,q)$. 

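Using Claim 7.17, we can show that the blocking condition and the $\to_{\updownarrow}$- and $\to_{\updownarrow\forall}$-rule 
work as desired. 

Claim 7.18 Let $p, q \in \mathrm{Paths}(T)$, $\mathbf{a} \subseteq C(\mathrm{Tail}(p))$, $\mathbf{b} \subseteq C(\mathrm{Tail}(q))$, $\mathbf{a}, \mathbf{b}$ non-empty tuples, 
and $(\mathbf{a},p) \approx (\mathbf{b},q)$. 

• For every atom $\beta$, $\beta(\mathbf{a}, * \cdots *) \in^* \Delta(\mathrm{Tail}(p))$ iff $\beta(\mathbf{b}, * \cdots *) \in^* \Delta(\mathrm{Tail}(q))$. 

• For every universally quantified $\varphi$, $\varphi(\mathbf{a}) \in \Delta(\mathrm{Tail}(p))$ iff $\varphi(\mathbf{b}) \in \Delta(\mathrm{Tail}(q))$. 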

Since both propositions are symmetric, we only need to prove one direction. If $(\mathbf{a},p) \approx 
(\mathbf{b},q)$ with $\mathbf{a} = a_1 a_2 \cdots a_m$ and $\mathbf{b} = b_1 b_2 \cdots b_m$, then 

$$\{p,q\} \subseteq \bigcap_{i=1}^{m} \mathrm{Paths}([a_i,p]_\approx)$$

and, as an intersection of subtrees of Paths(T), $\bigcap_{i=1}^{m} \mathrm{Paths}([a_i,p]_\approx)$ is itself a subtree of 
Paths(T). Hence, in Paths(T) there is a path $p_1, \ldots, p_k$ for which there exist tuples of 
constants $\mathbf{c}_1, \ldots, \mathbf{c}_k$ with $(\mathbf{c}_1,p_1) \approx \cdots \approx (\mathbf{c}_k,p_k)$, $p = p_1$, $q = p_k$, $\mathbf{a} = \mathbf{c}_1$, and $\mathbf{b} = \mathbf{c}_k$. 
Since $\mathbf{a}, \mathbf{b}$ are non-empty, so are the $\mathbf{c}_i$. From Claim 7.17, we get that for any two neighbours 
$p_i, p_{i+1}$ in Paths(T), $(\mathbf{c}_i,p_i) \approx (\mathbf{c}_{i+1},p_{i+1})$ implies $(\mathbf{c}_i,p_i) \sim (\mathbf{c}_{i+1},p_{i+1})$. 

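By two similar inductions on $i$ with $1 \leq i \leq k$ we show that if $\beta(\mathbf{a}, * \cdots *) \in^* \Delta(\mathrm{Tail}(p))$ 
then $\beta(\mathbf{c}_i, * \cdots *) \in^* \Delta(\mathrm{Tail}(p_i))$, and if $\varphi(\mathbf{a}) \in \Delta(\mathrm{Tail}(p))$ then $\varphi(\mathbf{c}_i) \in \Delta(\mathrm{Tail}(p_i))$. 

For $i = 1$ in both cases nothing has to be shown. Now assume that we have shown 
these properties up to $i$. W.l.o.g., assume $p_{i+1}$ is a successor of $p_i$ in the tree Paths(T). 
The other case is handled dually. There are two possibilities: 

• $\mathrm{Tail}'(p_{i+1})$ is not blocked. Then $\mathrm{Tail}(p_{i+1}) = \mathrm{Tail}'(p_{i+1})$ and by the definition of $\sim$, 
$\mathrm{Tail}(p_{i+1})$ is a successor of $\mathrm{Tail}(p_i)$ in $T$ and $\mathbf{c}_i = \mathbf{c}_{i+1}$ holds. 

If $\beta(\mathbf{a}, * \cdots *) \in^* \Delta(\mathrm{Tail}(p))$ then $\beta(\mathbf{c}_i, * \cdots *) \in^* \Delta(\mathrm{Tail}(p_i))$ holds by induction and due 
to the $\to_{\updownarrow}$-rule, this implies $\beta(\mathbf{c}_{i+1}, * \cdots *) \in^* \Delta(\mathrm{Tail}(p_{i+1}))$. The $\to_{\updownarrow}$-rule is applicable 
because, for the non-empty tuple $\mathbf{c}_i$, $\mathbf{c}_i = \mathbf{c}_{i+1} \subseteq C(\mathrm{Tail}(p_{i+1}))$ holds. 

If $\varphi(\mathbf{a}) \in \Delta(\mathrm{Tail}(p))$ then by induction $\varphi(\mathbf{c}_i) \in \Delta(\mathrm{Tail}(p_i))$ and due to the $\to_{\updownarrow\forall}$-rule 
this implies $\varphi(\mathbf{c}_{i+1}) \in \Delta(\mathrm{Tail}(p_{i+1}))$. 

• $\mathrm{Tail}'(p_{i+1})$ is blocked by $\mathrm{Tail}(p_{i+1})$ (with function $\pi$) and $\mathrm{Tail}'(p_{i+1})$ is a successor of 
$\mathrm{Tail}(p_i)$ in $T$. Then, by definition of $\sim$, we have $\mathbf{c}_{i+1} = \pi(\mathbf{c}_i)$ and $\mathbf{c}_i \subseteq C(\mathrm{Tail}(p_i)) \cap 
C(\mathrm{Tail}'(p_{i+1}))$. 

If $\beta(\mathbf{a}, * \cdots *) \in^* \Delta(\mathrm{Tail}(p))$ then $\beta(\mathbf{c}_i, * \cdots *) \in^* \Delta(\mathrm{Tail}(p_i))$ holds by induction and due 
to the $\to_{\updownarrow}$-rule this implies $\beta(\mathbf{c}_i, * \cdots *) \in^* \Delta(\mathrm{Tail}'(p_{i+1}))$. The $\to_{\updownarrow}$-rule is applicable 
because, for the non-empty tuple $\mathbf{c}_i$, $\mathbf{c}_i \subseteq C(\mathrm{Tail}'(p_{i+1}))$ holds. The node $\mathrm{Tail}(p_{i+1})$ 
blocks $\mathrm{Tail}'(p_{i+1})$, which implies 

$$\pi(\beta(\mathbf{c}_i, * \cdots *)) = \beta(\mathbf{c}_{i+1}, * \cdots *) \in^* \Delta(\mathrm{Tail}(p_{i+1})).$$

If $\varphi(\mathbf{a}) \in \Delta(\mathrm{Tail}(p))$ then by induction $\varphi(\mathbf{c}_i) \in \Delta(\mathrm{Tail}(p_i))$ and due to the $\to_{\updownarrow\forall}$-rule 
this implies $\varphi(\mathbf{c}_i) \in \Delta(\mathrm{Tail}'(p_{i+1}))$. Since $\mathrm{Tail}(p_{i+1})$ blocks $\mathrm{Tail}'(p_{i+1})$, $\pi(\varphi(\mathbf{c}_i)) = 
\varphi(\mathbf{c}_{i+1}) \in \Delta(\mathrm{Tail}(p_{i+1}))$ holds. 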

We now define the structure $\mathfrak{A}$ over the universe $A = C(T)/{\approx}$. For a relation $R \in \tau$ of 
arity $m$, $R^{\mathfrak{A}}$ is defined to be the set of tuples $([a_1,p_1]_\approx, \ldots, [a_m,p_m]_\approx)$ for which there exists 
a path $p \in \mathrm{Paths}(T)$ and constants $c_1, \ldots, c_m$ such that $(c_i,p) \approx (a_i,p_i)$ for all $1 \leq i \leq m$, 
and $R c_1 \ldots c_m \in \Delta(\mathrm{Tail}(p))$. 

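It remains to show that this construction yields $\mathfrak{A} \models \psi$. This is a consequence of the 
following claim. 

Claim 7.19 For every path $p \in \mathrm{Paths}(T)$ and $\mathbf{a} \subseteq C(\mathrm{Tail}(p))$, if $\varphi(\mathbf{a}) \in \Delta(\mathrm{Tail}(p))$, then 
$\mathfrak{A} \models \varphi([\mathbf{a},p]_\approx)$. 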

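We show this claim by induction on the structure of $\varphi$. If $\varphi(\mathbf{a}) = R a_1 \ldots a_m \in 
\Delta(\mathrm{Tail}(p))$, then the claim holds immediately by construction of $\mathfrak{A}$. 

Assume $\varphi(\mathbf{a}) = \neg R\mathbf{a} \in \Delta(\mathrm{Tail}(p))$, but $[\mathbf{a},p]_\approx \in R^{\mathfrak{A}}$. Then, by the definition of $\mathfrak{A}$, there 
must be a path $p'$ and constants $\mathbf{c}$ such that $(\mathbf{a},p) \approx (\mathbf{c},p')$ and $R\mathbf{c} \in \Delta(\mathrm{Tail}(p'))$. From 
Claim 7.18 we have that $(\mathbf{a},p) \approx (\mathbf{c},p')$ implies $R\mathbf{a} \in^* \Delta(\mathrm{Tail}(p))$ and, since $\mathbf{a}$ contains no 
occurrence of $*$, $R\mathbf{a} \in \Delta(\mathrm{Tail}(p))$. Hence $T$ contains the clash $\{R\mathbf{a}, \neg R\mathbf{a}\} \subseteq \Delta(\mathrm{Tail}(p))$, a 
contradiction to the fact that $T$ is clash-free. Thus, $[\mathbf{a},p]_\approx \notin R^{\mathfrak{A}}$. 

Assume $\varphi(\mathbf{a}) = a \neq b \in \Delta(\mathrm{Tail}(p))$ but $[a,p]_\approx = [b,p]_\approx$. From Claim 7.16 we get that 
this implies $a = b$ and hence $T$ contains the clash $a \neq a \in \Delta(\mathrm{Tail}(p))$. Again, this is a 
contradiction to the fact that $T$ is clash-free, and $[a,p]_\approx \neq [b,p]_\approx$ must hold. 

For positive Boolean combinations the claim is immediate due to the $\to_\wedge$- and $\to_\vee$-rule. 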

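Let $\varphi(\mathbf{a}) = (\forall \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z}))\chi(\mathbf{a}, \mathbf{y}) \in \Delta(\mathrm{Tail}(p))$ and $\mathbf{b}, \mathbf{p}, \mathbf{c}, \mathbf{q}$ arbitrarily chosen with 

$$\mathfrak{A} \models \alpha([\mathbf{a},p]_\approx, [\mathbf{b},\mathbf{p}]_\approx, [\mathbf{c},\mathbf{q}]_\approx). \qquad (7.1)$$

We need to show that also $\mathfrak{A} \models \chi([\mathbf{a},p]_\approx, [\mathbf{b},\mathbf{p}]_\approx)$ holds. In order to bring completeness 
of $T$ and the $\to_\forall$-rule into play, we show that information about the fact that (7.1) holds 
is present at a single node in $T$ where it triggers the $\to_\forall$-rule. We rely on the fact that 
universal quantifiers must be guarded. 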

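Every $y_i \in \mathbf{y}$ coexists with every other variable $y_j \in \mathbf{y}$ in at least one atom $\beta^{(i,j)} \in 
\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z})$ and with every element $a_\ell \in \mathbf{a}$ in at least one atom $\gamma^{(i,\ell)} \in \alpha(\mathbf{a}, \mathbf{y}, \mathbf{z})$. For any 
two distinct variables $y_i, y_j$, $\mathfrak{A} \models \beta^{(i,j)}([\mathbf{a},p]_\approx, [\mathbf{b},\mathbf{p}]_\approx, [\mathbf{c},\mathbf{q}]_\approx)$ holds and this can only be 
the case if there is a path $q^{(i,j)}$ and constants $c^{(i,j)}, d^{(i,j)}$ such that $(b_i,p_i) \approx (c^{(i,j)}, q^{(i,j)})$ 
and $(b_j,p_j) \approx (d^{(i,j)}, q^{(i,j)})$. 

Similarly, for every element $[b_i,p_i]_\approx \in [\mathbf{b},\mathbf{p}]_\approx$ and every element $[a_\ell,p]_\approx$ there 
exists a path $r^{(i,\ell)}$ and constants $f^{(i,\ell)}, g^{(i,\ell)}$ such that $(b_i,p_i) \approx (f^{(i,\ell)}, r^{(i,\ell)})$ and $(a_\ell,p) \approx 
(g^{(i,\ell)}, r^{(i,\ell)})$. For every $i$ and $\ell$, $\mathrm{Paths}([b_i,p_i]_\approx)$ and $\mathrm{Paths}([a_\ell,p]_\approx)$ are subtrees of Paths(T). 

The tree $\mathrm{Paths}([b_i,p_i]_\approx)$ overlaps with the tree $\mathrm{Paths}([b_j,p_j]_\approx)$ at $q^{(i,j)}$ and with the tree 
$\mathrm{Paths}([a_\ell,p]_\approx)$ at $r^{(i,\ell)}$. From this it follows (Golumbic, 1980, Proposition 4.7) that there 
exists a common path 

$$s \in \bigcap_{i} \mathrm{Paths}([b_i,p_i]_\approx) \cap \bigcap_{\ell} \mathrm{Paths}([a_\ell,p]_\approx).$$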

Thus, there are tuples $\mathbf{a}', \mathbf{b}'$ such that 

$$(\mathbf{a}',s) \approx (\mathbf{a},p) \text{ and } (\mathbf{b}',s) \approx (\mathbf{b},\mathbf{p}). \qquad (7.2)$$
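We now show that the preconditions of the $\to_\forall$-rule are satisfied at $\mathrm{Tail}(s)$ for the 
formula $(\forall \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}', \mathbf{y}, \mathbf{z}))\chi(\mathbf{a}', \mathbf{y})$ and the tuple $\mathbf{b}'$. First, due to Claim 7.18, it holds that 
$(\forall \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}', \mathbf{y}, \mathbf{z}))\chi(\mathbf{a}', \mathbf{y}) \in \Delta(\mathrm{Tail}(s))$ because $(\mathbf{a},p) \approx (\mathbf{a}',s)$ and $(\forall \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z}))\chi(\mathbf{a}, \mathbf{y}) \in 
\Delta(\mathrm{Tail}(p))$. 

For every $\beta(\mathbf{x}, \mathbf{y}, \mathbf{z}) \in \alpha(\mathbf{x}, \mathbf{y}, \mathbf{z})$, $\beta(\mathbf{a}', \mathbf{b}', * \cdots *) \in^* \Delta(\mathrm{Tail}(s))$ holds as follows: from 
(7.1) and (7.2) we get 

$$\mathfrak{A} \models \beta([\mathbf{a}',s]_\approx, [\mathbf{b}',s]_\approx, [\mathbf{c},\mathbf{q}]_\approx).$$

Since $\beta$ is an atom, this implies the existence of a path $t$ and tuples $\mathbf{a}'', \mathbf{b}'', \mathbf{c}'$ with 

$$(\mathbf{a}',s) \approx (\mathbf{a}'',t),\quad (\mathbf{b}',s) \approx (\mathbf{b}'',t),\quad (\mathbf{c},\mathbf{q}) \approx (\mathbf{c}',t) \quad\text{and}\quad \beta(\mathbf{a}'', \mathbf{b}'', \mathbf{c}') \in \Delta(\mathrm{Tail}(t)).$$

Clearly, $\beta(\mathbf{a}'', \mathbf{b}'', * \cdots *) \geq^* \beta(\mathbf{a}'', \mathbf{b}'', \mathbf{c}')$ and, since $\beta(\mathbf{a}'', \mathbf{b}'', \mathbf{c}') \in \Delta(\mathrm{Tail}(t))$, it holds 
that $\beta(\mathbf{a}'', \mathbf{b}'', * \cdots *) \in^* \Delta(\mathrm{Tail}(t))$. Thus, by Claim 7.18 it holds that $\beta(\mathbf{a}', \mathbf{b}', * \cdots *) \in^* 
\Delta(\mathrm{Tail}(s))$. 

Since this is true for every atom $\beta$, the preconditions of the $\to_\forall$-rule are satisfied and 
the completeness of $T$ yields $\chi(\mathbf{a}', \mathbf{b}') \in \Delta(\mathrm{Tail}(s))$. By induction, $\mathfrak{A} \models \chi([\mathbf{a}',s]_\approx, [\mathbf{b}',s]_\approx)$ 
holds and together with (7.2) this implies $\mathfrak{A} \models \chi([\mathbf{a},p]_\approx, [\mathbf{b},\mathbf{p}]_\approx)$. Since $\mathbf{b}, \mathbf{p}, \mathbf{c}, \mathbf{q}$ have been 
chosen arbitrarily, $\mathfrak{A} \models \varphi([\mathbf{a},p]_\approx)$ holds. 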

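If $\varphi(\mathbf{a}) = (\exists \mathbf{y}\mathbf{z}.\alpha(\mathbf{a}, \mathbf{y}, \mathbf{z}))\chi(\mathbf{a}, \mathbf{y}) \in \Delta(\mathrm{Tail}(p))$, there are two possibilities: 

• There are $\mathbf{b}, \mathbf{c} \subseteq C(\mathrm{Tail}(p))$ with $\{\alpha(\mathbf{a}, \mathbf{b}, \mathbf{c}), \chi(\mathbf{a}, \mathbf{b})\} \subseteq \Delta(\mathrm{Tail}(p))$. Then, by induction, 
we have 

$$\mathfrak{A} \models \{\alpha([\mathbf{a},p]_\approx, [\mathbf{b},p]_\approx, [\mathbf{c},p]_\approx),\ \chi([\mathbf{a},p]_\approx, [\mathbf{b},p]_\approx)\}$$

and hence $\mathfrak{A} \models \varphi([\mathbf{a},p]_\approx)$. 

• There are no such $\mathbf{b}, \mathbf{c} \subseteq C(\mathrm{Tail}(p))$. Then there is a successor $w$ of $\mathrm{Tail}(p)$ and $\mathbf{b}, \mathbf{c} \subseteq 
C(w)$ with $\{\alpha(\mathbf{a}, \mathbf{b}, \mathbf{c}), \chi(\mathbf{a}, \mathbf{b})\} \subseteq \Delta(w)$. The node $w$ can be blocked or not. 

If $w$ is not blocked, then $p' = [p, \frac{w}{w}] \in \mathrm{Paths}(T)$ and by induction 

$$\mathfrak{A} \models \{\alpha([\mathbf{a},p']_\approx, [\mathbf{b},p']_\approx, [\mathbf{c},p']_\approx),\ \chi([\mathbf{a},p']_\approx, [\mathbf{b},p']_\approx)\}.$$

From the definition of $\approx$ we have $(\mathbf{a},p') \approx (\mathbf{a},p)$ and hence $\mathfrak{A} \models \varphi([\mathbf{a},p]_\approx)$. 

If $w$ is blocked by a node $u$ (with function $\pi$) then $p' = [p, \frac{u}{w}] \in \mathrm{Paths}(T)$. From the 
blocking condition, we have that $u$ is unblocked and $\pi(\{\alpha(\mathbf{a}, \mathbf{b}, \mathbf{c}), \chi(\mathbf{a}, \mathbf{b})\}) \subseteq \Delta(u)$. 
Hence, by induction 

$$\mathfrak{A} \models \{\alpha([\pi(\mathbf{a}),p']_\approx, [\pi(\mathbf{b}),p']_\approx, [\pi(\mathbf{c}),p']_\approx),\ \chi([\pi(\mathbf{a}),p']_\approx, [\pi(\mathbf{b}),p']_\approx)\},$$

and, by definition of $\approx$, we have that $(\mathbf{a},p) \approx (\pi(\mathbf{a}),p')$ and hence $\mathfrak{A} \models \varphi([\mathbf{a},p]_\approx)$. 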

As a special instance of Claim 7.19 we get that $\mathfrak{A} \models \psi$. From Lemma 7.10, we get that, 
for every node $v \in V$, $|C(v)| \leq \mathrm{width}(\psi)$ and hence the tree Paths(T) together with the 
function $f : \mathrm{Paths}(T) \to C(T)/{\approx}$ with $f(p) = C(\mathrm{Tail}(p))/{\approx}$ provides a tree decomposition 
of $\mathfrak{A}$ of width $\leq \mathrm{width}(\psi) - 1$. This completes the proof of Lemma 7.15. ■ 
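
For readers less familiar with tree decompositions, the following sketch checks the standard conditions on a small relational structure and computes the width as the largest bag size minus one. It is an illustration under stated assumptions, not part of the proof: the function names and the representation (bags as a dictionary from tree nodes to sets, relations flattened into a list of tuples) are hypothetical, and `edges` is assumed to describe a tree.

```python
from collections import defaultdict

def decomposition_width(bags):
    """Width of a decomposition: size of the largest bag minus one."""
    return max(len(b) for b in bags.values()) - 1

def is_tree_decomposition(edges, bags, universe, tuples):
    """Check the tree decomposition conditions for a relational structure."""
    # (1) every element of the universe occurs in some bag
    if not all(any(x in b for b in bags.values()) for x in universe):
        return False
    # (2) every tuple of every relation is contained in a single bag
    if not all(any(set(t) <= b for b in bags.values()) for t in tuples):
        return False
    # (3) for each element, the bags containing it form a connected subtree
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for x in universe:
        nodes = {n for n, b in bags.items() if x in b}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            n = stack.pop()
            for m in adj[n] & nodes:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        if seen != nodes:
            return False
    return True
```

In the construction above, the tree nodes play the role of the paths and each bag holds at most width(ψ) classes, which is what bounds the width by width(ψ) − 1.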
As an aside, together with Lemma 7.12, the construction used to prove Lemma 7.15 
yields an alternative proof of Fact 7.7: 

Corollary 7.20 

CGF, and hence also LGF and GF, have the generalised tree model property. 

Proof. Let $\psi \in \mathrm{CGF}[\tau]$ be satisfiable. Then, from Lemma 7.12 we get that there is 
a tableau $T$ for $\psi$. By Lemma 7.15, $T$ induces a model for $\psi$ of tree width at most 
$\mathrm{width}(\psi) - 1$. Note that we have never relied on Fact 7.7 to obtain any of the results in 
this thesis and hence have indeed given an alternative proof for the generalised tree model 
property of CGF. For LGF and GF, observe that the embedding of these logics into CGF 
may increase the width of the sentence but not by more than a recursive amount. ■ 

Lemmas 7.11, 7.12, and 7.15 yield correctness of the tableau algorithm for CGF. 

Theorem 7.21 

The tableau algorithm is a decision procedure for CGF-satisfiability. 

An optimized implementation of this tableau algorithm is part of ongoing work. It will 
be interesting to see if the tableau algorithm is amenable to the same optimizations devel- 
oped for modal or description logic tableau algorithms and how it performs in comparison 
with the resolution based approach from (Ganzinger & de Nivelle, 1999). 



Chapter 8 
Summary 



The two major subjects of this thesis were (i) the worst-case complexity of reasoning with 
expressive description logics, particularly in the presence of counting operators; and (ii) 
the development of practical algorithms for description and guarded logics. This chapter 
summarizes and comments on the main results obtained on these topics. 



Local Counting Qualifying number restrictions introduce a form of counting into DLs, 
which is local because only statements about the number of role successors of an individual 
are expressible. Until now, the impact of qualifying number restrictions on the complex- 
ity of reasoning was unknown, if binary coding of numbers in the input is assumed. In 
this thesis, we have shown that — in terms of worst-case complexity — qualifying number 
restrictions do not lead to a rise in complexity of the reasoning problems. 

As for ALC, concept satisfiability for ALCQ is PSPACE-complete, even for the case 
of binary coding of numbers in the input (Theorem 4.6). The same applies to ALCQIb 
(Theorem 4.29), which extends ALCQ with inverse roles and safe role expressions, and is 
one of the most expressive DLs for which qualifying number restrictions have been studied. 

For the other inference problems, we have shown that 
knowledge base satisfiability for ALCQIb is ExpTime-complete also in the case of binary 
coding of numbers in the input (Theorem 4.42), and hence again has the same complexity 
as the same problem for ALC. 

It is also possible to add qualifying number restrictions to DLs that allow for transitive 
roles and role hierarchies without an increase in worst-case complexity. Concept satisfiability 
(with or without general TBoxes) for SHIQ is ExpTime-hard (Theorem 6.29), also 
if binary coding of numbers in the input is assumed. This matches the complexity of the 
same problems for the DL SH, i.e., the fragment of SHIQ that does not allow for inverse 
roles or number restrictions. To maintain decidability of the decision problems for SHIQ, 
it was necessary to allow qualifying number restrictions only over roles that are neither 
transitive nor have transitive sub-roles. In the absence of role hierarchies, the effect of 
number restrictions over transitive roles on complexity and decidability is open. 
Global Counting If we additionally consider constructors that express global 
counting statements like cardinality restrictions or nominals,¹ then this increases the 
complexity of the inference problems, independent of the coding of numbers in the input. 

We have shown that knowledge base satisfiability becomes NExpTime-hard if cardinality 
restrictions are added to ALCQ (Theorem 5.20) or ALCQI (Theorem 5.19), while 
knowledge base satisfiability for ALCQ and ALCQI without cardinality restrictions is 
ExpTime-complete (as a corollary of Theorem 4.42). The "gap" in the complexity is even 
wider if we consider nominals: the complexity of concept satisfiability rises from 
PSPACE-complete to NExpTime-hard if nominals are added to ALCQI (Corollary 5.27). 

A special case is the DL ALCQIB, for which nominals or cardinality restrictions can 
be added without a change in the complexity of the inference problems. Yet, a closer 
look shows that cardinality restrictions (and hence nominals) can already be expressed by 
ALCQIB-concepts (Lemma 5.32), which explains why they do not have an impact on the 
complexity. 

Coding of Numbers One of the recurring themes of the thesis has been the impact that 
coding of numbers in the input has on the complexity of the reasoning problems. With 
respect to this topic, we have obtained only an incomplete picture. 

All our results for local counting are independent of the coding of numbers and one 
of the main contributions of this thesis is the development of algorithms that deal with 
binary coding of numbers in the input without an additional exponential overhead. 
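
To see why the coding matters, compare the input length contributed by a number n under the two codings. The helper names below are hypothetical and the snippet is only a back-of-the-envelope illustration: a procedure that, for instance, explicitly generates n role successors for an at-least restriction takes a number of steps that is linear in the unary input length but exponential in the binary one.

```python
def unary_size(n: int) -> int:
    """Length of n written in unary (n strokes)."""
    return n

def binary_size(n: int) -> int:
    """Length of n written in binary (at least one digit)."""
    return max(1, n.bit_length())

def blowup(n: int) -> float:
    """Ratio between the number n itself and the length of its binary coding."""
    return n / binary_size(n)
```

For n = 1024 the binary coding needs 11 digits while the unary coding needs 1024 symbols, so work proportional to n is already about 93 times the binary input length, and the ratio grows exponentially with the number of digits.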

For logics that allow for global counting, we obtain tight complexity results only if unary 
coding of numbers in the input is assumed. This is because the upper (NExpTime) 
bounds rely on a reduction to $C^2$, the two-variable fragment of FOL with counting 
quantifiers, for which the exact complexity is also known only for the unary case. Of course, reasoning 
does not become easier in the binary case, and so the lower bounds hold independently 
of the coding. For the upper bounds, we only know that all problems can be solved in 
2-NExpTime but we do not have matching hardness results. It is an interesting open 
question whether the exponential blow-up is necessary or whether there are algorithms that can deal 
with the binary case without an increase in complexity. 

Until such algorithms are developed (or 2-NExpTime-hardness is proved), it is open 
if the complexity of these reasoning problems rises when switching from unary to binary 
coding of numbers in the input. The only case for which an increase in complexity is 
certain is ALCQ with cardinality restrictions, for which satisfiability is ExpTime-complete 
in the unary case (Corollary 5.8) and NExpTime-hard in the binary case (Theorem 5.20). 

Practical Algorithms The practicality of inference algorithms, i.e., how easily they 
can be implemented and optimized and how they behave on "real world" instances, is 
important for their application in DL systems. In general, tableau algorithms have proven 
to be amenable to a number of powerful optimization techniques and are successfully 
employed in many DL systems. However, a tableau algorithm is not practical just because 
it is a tableau algorithm. One important criterion for the practicality of an algorithm seems 
to be the degree to which it depends on non-deterministic choices because, in a (necessarily 
deterministic) implementation, the different possibilities have to be searched exhaustively 
in order to obtain a complete algorithm. This search is responsible for most of the 
runtime of tableau algorithms and most of the aforementioned optimizations aim to reduce 
the size of the search space. 

¹ At first sight, it might look odd that we subsume nominals under global counting. Yet, the requirement 
that nominals must be interpreted by singletons can be seen as a form of global counting and, indeed, 
Lemma 5.5 exhibits a close connection between reasoning with nominals and with cardinality restrictions. 

So, one of the major design principles of the SI-, SHIQ-, and CGF-algorithms in this 
thesis was to avoid non-deterministic choices as much as possible. The SHIQ-algorithm 
developed in this thesis (Algorithm 6.34) forms the basis of the highly optimized DL system 
iFaCT (Horrocks, 1999) that shows good performance in system comparisons (Horrocks, 
2000) and is successfully applied in applications (see, e.g., Franconi & Ng, 2000). One 
problem of the SHIQ-algorithm lies in the non-deterministic identification of nodes due to 
its $\to_\leq$-rule. The development of optimization techniques that specifically deal with this 
problem is part of ongoing work. Moreover, it will be interesting to extend the refined 
blocking strategy developed for the SI-algorithm to SHIQ and implement it in the iFaCT 
system. We have claimed that the tableau algorithm for CGF (Algorithm 7.9) is useful as 
the basis of an efficient reasoner and the implementation of such a system is in progress. It 
will be interesting to see how this implementation performs in comparison with the existing 
decision procedures for guarded fragments based on refinements of general FOL theorem 
proving techniques. 

In the ALCQ-algorithm (Algorithm 3.2) and the ALCQIb-algorithm (Algorithm 4.21), 
we have freely used non-determinism in order to obtain a space-efficient algorithm. As a 
consequence, they seem to be less suited for an implementation because of their highly 
non-deterministic $\to_\geq$-rule. 

For the remaining algorithms developed in this thesis, it is at least questionable if they 
can serve as the basis of an efficient implementation. This is especially the case for the 
decision procedure used to prove ExpTime-completeness of concept satisfiability of ALCQIb 
w.r.t. general TBoxes (see Theorem 4.38), which is based on a highly inefficient automata 
construction. It is even less likely that an efficient decision procedure can be obtained from 
our decision procedures for knowledge base satisfiability for ALCQIb (see Theorem 4.42) or 
from the worst-case optimal decision procedures for SHIQ (see Corollary 6.29 and 
Corollary 6.30) because these add a wasteful pre-completion technique and various translations 
on top of the already inefficient algorithm for ALCQIb with general TBoxes. 

All decision procedures for the NExpTime-hard DLs presented in this thesis employ a 
reduction to $C^2$, for which the only known decision procedures work by model enumeration, 
and so no decision procedure for these logics is known that could be of practical use. 
This situation is particularly unsatisfactory for the DL SHIQO, for which such a decision 
procedure would be of high interest due to the role of SHIQO for inferences for the semantic web 
(Fensel et al., 2000; Horrocks & Sattler, 2001). Maybe the most intriguing question left 
open by this thesis is how practical decision procedures for NExpTime-complete modal 
and description logics can be developed. 